Abstract

A unitary operator $U = \sum_{j,k} u_{j,k} |k\rangle\langle j|$ is called diagonal when $u_{j,k} = 0$ unless $j = k$. The definition extends to quantum computations, where $j$ and $k$ vary over the $2^n$ binary expressions for integers $0, 1, \ldots, 2^n - 1$, given $n$ qubits. Such operators do not affect outcomes of the projective measurement $\{\langle j| \; ; \; 0 \le j \le 2^n - 1\}$ but rather create arbitrary relative phases among the computational basis states $\{|j\rangle \; ; \; 0 \le j \le 2^n - 1\}$. These relative phases are often required in applications.

Constructing quantum circuits for diagonal computations using standard techniques requires either $O(n^2 2^n)$ controlled-not gates and one-qubit Bloch sphere rotations or else $O(n 2^n)$ such gates and a work qubit. This work provides a recursive, constructive procedure which inputs the matrix coefficients of $U$ and outputs such a diagram containing $2^{n+1} - 3$ alternating controlled-not gates and one-qubit z-axis Bloch sphere rotations. Up to a factor of two, these circuits are the smallest possible. Moreover, should the computation $U$ be a tensor of diagonal one-qubit computations of the form $R_z(\alpha) = e^{-i\alpha/2}|0\rangle\langle 0| + e^{i\alpha/2}|1\rangle\langle 1|$, then a cancellation of controlled-not gates reduces our circuit to that of an n-qubit tensor.



1 Introduction 

Let $U(N) = \{V \text{ an } N \times N \text{ matrix} \; ; \; VV^* = 1\}$, where 1 is an identity matrix and $V^* = \bar{V}^t$ is the mathematical notation for the adjoint. One may view $U(N)$ (with $N = 2^n$) as the set of all reversible quantum computations acting on $n$ qubits. Then our usual convention is that algorithms for quantum circuit synthesis input such a $V \in U(N)$ and output a quantum circuit diagram for $V$, up to global phase. Several distinct quantum circuits may realize the same computation $V$. Thus, one seeks circuits for which the total number of gates is small. This work focuses on the case where the input computation is diagonal.

Gate counts for quantum circuits are often made in terms of 
basic gates [1], i.e., the set of all controlled-not gates and one- 
qubit computations. Our gate counts will be made with respect 
to the following gate library. We refer to elements as elementary 
gates, in contrast to basic gates. 

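1. For $1 \le j \le n$, apply $R_y(\theta) \in U(2^1)$ on line $j$, where

$$R_y(\theta) = \cos\tfrac{\theta}{2}\,|0\rangle\langle 0| + \sin\tfrac{\theta}{2}\,|0\rangle\langle 1| - \sin\tfrac{\theta}{2}\,|1\rangle\langle 0| + \cos\tfrac{\theta}{2}\,|1\rangle\langle 1|, \quad 0 \le \theta < 2\pi \tag{1}$$

is a y-axis Bloch sphere rotation [12, §4.2].

2. For $1 \le j \le n$, apply $R_z(\alpha) \in U(2^1)$ on line $j$, where

$$R_z(\alpha) = e^{-i\alpha/2}\,|0\rangle\langle 0| + e^{i\alpha/2}\,|1\rangle\langle 1|, \quad 0 \le \alpha < 2\pi \tag{2}$$

is a z-axis Bloch sphere rotation [12, §4.2].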

3. Let $1 \le j, k \le n$, let $b_1, b_2, \ldots, b_n$ be $n$ variables varying in the field of two elements $F_2$, and let $x, y \mapsto x \oplus y$ denote the exclusive-or (XOR) operator which is addition in $F_2$. The final type of elementary gate is the $j$-controlled-not gate acting on line $k$. We denote it by $C^k_j$. In case $j < k$,

$$C^k_j = \sum_{0 \le b_1 b_2 \cdots b_n \le N-1} |b_1 \cdots b_j \cdots (b_j \oplus b_k) \cdots b_n\rangle\langle b_1 \cdots b_j \cdots b_k \cdots b_n| \tag{3}$$

The other case $k < j$ is similar.
The elementary gate library is universal because any $V \in U(N)$ factors into basic gates [1] and any one-qubit computation $W$ can be decomposed into $W = e^{i\phi} R_y(\theta_1) R_z(\alpha) R_y(\theta_2)$ for $e^{i\phi}$ an unmeasurable global phase [1, Lemma 4.1] [12, §4.2]. Moreover, the asymptotics $\Omega(-)$, $O(-)$, and $\Theta(-)$ of the counts in either gate library are identical, since every elementary gate is basic while every basic gate factors into at most three elementary gates.

We next set some conventions. Throughout, $U$ is a diagonal quantum computation on $n$ qubits. Thus, for $N = 2^n$, $U$ acts on the n-qubit state space which is the $\mathbb{C}$ span of the computational basis $\{|j\rangle \; ; \; 0 \le j \le N-1\}$. The $j$ are typically written as binary integers. As $U$ is diagonal, $U = \sum_{j=0}^{N-1} u_j |j\rangle\langle j|$. Moreover, $U$ unitary implies $|u_j|^2 = 1$.
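As a small illustration of these conventions, the sketch below stores a diagonal computation as its list of unit-modulus entries $u_j$ and checks that applying it changes relative phases but not the outcome probabilities of the computational-basis measurement. The function names and the sample two-qubit data are hypothetical choices made for this example only.

```python
import cmath

def apply_diagonal(phases, state):
    """Apply U = sum_j u_j |j><j| to a state given as a list of amplitudes."""
    return [u * a for u, a in zip(phases, state)]

def probabilities(state):
    """Outcome probabilities of the computational-basis measurement."""
    return [abs(a) ** 2 for a in state]

# Hypothetical 2-qubit data: four unit-modulus entries and a normalized state.
phases = [cmath.exp(1j * t) for t in (0.3, 1.1, -0.7, 2.0)]
state = [0.5, 0.5j, -0.5, 0.5]
after = apply_diagonal(phases, state)
```

Here `probabilities(after)` agrees with `probabilities(state)`, while the ratio `after[1] / after[0]` differs from `state[1] / state[0]`, which is the relative phase the diagonal computation creates.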

We denote the Lie group [9] of all diagonal computations on 
n-qubit states by A. The notation A (n) may be used for emphasis. 
Observe that A is abelian, i.e., commutative. 

Circuit synthesis algorithms that provably produce minimal 
gate counts are rare, difficult to construct, and have been pub- 
lished for special cases only [14]. Before stating our main result, 
we formalize a sense in which it is best possible. 

Definition 1.1 Let $H \subset U(N)$ be an analytic subgroup. An n-qubit quantum circuit synthesis algorithm with inputs restricted to $H$ is said to stably output to $H$ iff i) it outputs at most a countably infinite number of quantum circuit topologies containing only elementary gates as inputs are varied over all of $H$ and ii) for each such circuit topology $\tau$, the corresponding computations remain in $H$ for every variation of parameter on any $R_y(\theta)$, $R_z(\alpha)$ gate within $\tau$. If $\varsigma$ is such a synthesis algorithm accepting any input from $H$ and outputting stably to $H$, we put $\#\varsigma = \max\{\#\tau \; ; \; \tau \text{ is a diagram output by } \varsigma\}$, where $\#\tau$ refers to the number of elementary gates in $\tau$. We finally put

$$\ell(H) = \min\{\#\varsigma \; ; \; \varsigma \text{ outputs stably to } H\} \tag{4}$$



Definition 1.2 Consider now a family $\{H(n) \subset U(2^n) \; ; \; n \ge 1\}$ of analytic subgroups [ : , p. 47]. A family of n-qubit synthesis algorithms $\{\varsigma(n) \; ; \; n \ge 1\}$, each allowing for any input in $H(n)$ and outputting stably to $H(n)$, will be called stably asymptotically optimal iff $\#\varsigma(n) \in O(\ell[H(n)])$.

Theorem 1.3 Any n-qubit diagonal computation $U \in A(n)$ may be realized by a quantum circuit holding $2^{n+1} - 3$ alternating controlled-not gates and z-axis Bloch sphere rotations $R_z(\alpha)$. The construction is stably asymptotically optimal for $A(n)$.

Remark 1.4 Two other comments should be made about the construction. First, it requires neither a work qubit [1] nor any $R_y(\theta)$ elementary gates. Second, should an n-qubit tensor of the form $\otimes_{j=1}^{n} R_z(\alpha_j)$ be input to the algorithm, the output will hold several cancelling controlled-not gates. After cancellation, the output will match the input. ◊

As a benchmark, we describe in Section 2 a diagram for a given diagonal $U$ using standard techniques. The technique hinges on a well-known circuit diagram for an $(n-1)$-qubit controlled element of $A(1)$. In the presence of one ancilla (work) qubit, this diagram holds $O(n 2^n)$ basic or equivalently elementary gates. The cost rises to $O(n^2 2^n)$ when there is no ancilla qubit. Thus, the asymptotic cost of $O(2^n)$ of the synthesis algorithm of the Theorem (see Section 4) compares favorably with known results. Moreover, dimension counts during the argument for stable asymptotic optimality will make clear that synthesizing large subsets of $A$ requires at least $2^n - 1$ elementary gates. In this specialized sense, $\Omega(2^n)$ gates are required, and the diagram of Section 4 proves that diagrams for generic diagonal computations cost $\Theta(2^n)$ elementary (or basic) gates.

See Figure 1 for the overall circuit topology in the case n = 3 
qubits. We defer a description of the algorithm for computing the 
R z angles to the body and next discuss potential applications. 

The first application is in conjunction with the standard synthesis algorithm [1, 3] [12, §4.5], which may be formalized using the QR matrix decomposition [4, 3]. For $V \in U(N)$, the algorithm uses a matrix factorization $V = QR$, where $Q$ is a product of Givens rotations [ ] realizable as $(n-1)$-controlled one-qubit computations and $R$ is diagonal. Should the projective measurement $\{\langle j| \; ; \; 0 \le j \le N-1\}$ follow $V$, one need not apply $R$.

Consider instead the following situation. For $p \ll n$, a desired computation $V \in U(N)$ is known to arise from $V_1, V_2, \ldots, V_{n-p+1} \in U(2^p)$ as follows. First $V_1$ is applied on lines $1, 2, \ldots, p$, after which $V_2$ is applied on lines $2, 3, \ldots, p+1$, and so on until finally $V_{n-p+1}$ is applied on lines $n-p+1, n-p+2, \ldots, n$. If quantum computing technology has progressed so that $O(np2^p)$ elementary gates may be realized directly, one may factor each $V_1 = Q_1R_1$, $V_2 = Q_2R_2$, $\ldots$, $V_{n-p+1} = Q_{n-p+1}R_{n-p+1}$ and apply the standard synthesis algorithm on each subblock. However, with the convention that $\{\langle j| \; ; \; 0 \le j \le N-1\}$ is only applied after the entire computation $V$, we now need quantum circuits realizing each of the $R_1, R_2, \ldots, R_{n-p+1}$. The synthesis algorithms proposed in this paper provide these. Moreover, note that the essential part of the argument is merely the overlap of the smaller blocks, not their pattern.

Two further instances commonly arise where one needs to be 
careful about relative phases of computational basis states. 

• Suppose that for $V \in U(2^{n-1})$, one wishes to build a circuit for the computation $(1 \oplus V) \in U(2^n)$ which applies $V$ iff the top line carries $|1\rangle$. Suppose one has a circuit for $V$, correct up to relative phase. For example, such a circuit results from the factorization of $Q$ into Givens rotations using $V = QR$ [3]. A straightforward approach is to condition every gate in $Q$, so that e.g. controlled-not gates in $Q$ correspond to Toffoli computations in $1 \oplus Q$. Yet $1 \oplus R$ will affect measurements in the n-qubit computational basis, unlike the diagonal computation $R$ in $(n-1)$ qubits. One even needs a conditioned gate for the global phase of the original $V$.

• Moreover, circuits for diagonal computations are required whenever the final projective measurement [12, §2.2.5] is not $\{\langle j| \; ; \; 0 \le j \le N-1\}$.

Another possible application of circuits for diagonal quantum computations is to reduce the synthesis of arbitrary quantum computations to the synthesis of real quantum computations [ 3], i.e., of those $V \in O(N) = \{V \in U(N) \; ; \; V = \bar{V}\}$. For there is a matrix decomposition $U(N) = O(N)\,A\,O(N)$. Indeed,



Figure 1: This diagram shows the circuit structure realizing a three qubit diagonal computation using our circuit synthesis algorithm. The general algorithm applies in $n$ rather than merely three qubits and extends the construction of Section 2.2 of a previous work [2]. Should the input diagonal be of the form $U = R_z(\alpha_1) \otimes R_z(\alpha_2) \otimes R_z(\alpha_3)$, the second, third, fourth, and sixth $R_z$ gates of the output diagram are trivial, implying that all controlled-not gates cancel. The output then coincides with the input.



this is a special case of the KAK metadecomposition [9, 8, 2]. Thus if $V \in U(N)$ is arbitrary, we may write $V = O_1 U O_2$ for $O_1, O_2 \in O(N)$ real quantum computations and $U \in A(n)$. The present work produces a circuit for $U \in A(n)$, reducing the question of a circuit for $V \in U(N)$ to circuits for $O_1, O_2 \in O(N)$.

Finally, we expect further applications to other quantum circuit synthesis algorithms relying on other examples of the KAK matrix metadecomposition. Another such example is the Cosine-Sine decomposition [17]. This decomposition states that one may write any $V \in U(N)$ as $V = (U_1 \oplus U_2)\,W\,(U_3 \oplus U_4)$ for $U_1, U_2, U_3, U_4 \in U(N/2)$ and $W$ a sparse matrix whose nonzero entries are paired cosines and sines. A quantum circuit for the matrix $W$ may be synthesized using the algorithm of this paper. Indeed, let $S = |0\rangle\langle 0| + i|1\rangle\langle 1|$ and $H$ denote the Hadamard gate, costing one and two elementary gates respectively [2]. Then for 1 an $(N/2) \times (N/2)$ identity matrix, one may compute that $U = [SH \otimes 1]\,W\,[(SH)^* \otimes 1] \in A$ is a diagonal computation. Hence, one may implement the nonrecursive portion of Cosine-Sine synthesis using the methods of this paper and six extra elementary gates.

We briefly outline the body of the paper. Section 2 describes an algorithm for building quantum circuits for diagonal computations which is analogous to an unoptimized version of classical two-level synthesis of logic functions. This algorithm produces circuits with $O(n 2^n)$ gates given a single ancilla qubit and $O(n^2 2^n)$ gates otherwise. Section 3 outlines how to use Lie theory [9] to recognize when $U_n \in A(n)$ factors as a tensor on line $n$, i.e., the case $U_n = U_{n-1} \otimes R_z(\alpha)$ for $U_{n-1} \in A(n-1)$. Section 4 motivates and describes the recursive construction of the circuits of Theorem 1.3. Finally, Section 5 discusses dimension counts required for the lower bounds proving that our circuit diagrams are generically asymptotically optimal. Appendix A gives a construction similar to that of the Theorem, using $(n-1)$-controlled $R_z$ gates.

Finally, some mathematical background beyond that usually associated to the quantum computing literature [12] is required to understand the arguments in this manuscript. The constructive synthesis algorithm makes use of the Lie theory of commutative matrix groups [9]. The argument for stable lower bounds makes use of the theory of smooth manifolds as is commonly treated in differential topology [5].



2 Prior Work 

Circuits with measurement gates of Hogg et al. 

Hogg et al. [7] consider synthesis of quantum circuits for diago- 
nal computations from a much different perspective. Their main 
result is polynomial-size circuits, but in somewhat different cir- 
cumstances compared to our work. 

• The diagonal computations $U = \sum_{j=0}^{N-1} u_j |j\rangle\langle j|$ to which the prior result applies are required to have many $u_j$ repeat. Indeed, accounting for the global phase, one supposes a family of diagonal computations $\{U_n = \sum_{j=0}^{2^n-1} u_{n,j} |j\rangle\langle j| \; ; \; n \ge 1\}$ where $\#\{u_{n,j} \ne 1 \; ; \; 0 \le j \le 2^n - 1\}$ scales as some polynomial $p(n)$.

• Moreover, the algorithm chosen in later steps depends on outputs of measurements of the quantum memory state in earlier steps. In the construction of classical circuits, the gate count would be increased by at least one MUX (if-then-else) gate for each classical branching, and each unique $u_j$ contributes such a branching. The presence of measurement gates moreover takes their algorithm out of the present context of reversible gate libraries.

• The circuits ibid. would be large on a generic input of $\otimes_{j=1}^{n} R_z(\alpha_j)$ due to little repetition in the input phases. Thus, a separate section [7, §4] describes a precomputation to determine whether an input is of the form $\otimes_{j=1}^{n} R_z(\alpha_j)$. If this is the case, one should instead choose the tensor diagram. In contrast, given an input $U = \otimes_{j=1}^{n} R_z(\alpha_j)$, our output circuits automatically contain several cancelling controlled-nots. After cancellation, one recovers the input tensor.

Despite these caveats, the citation above does include some of 
the discussion of the next subsection. 
Figure 2: For $S = \{1,2\} \subset \{1,2,3\}$, this figure shows at left $[X^{\delta_S} \otimes 1] \circ \Lambda_{\{1,2,3\}}(V) \circ [X^{\delta_S} \otimes 1]$. At right is the first reduction of this circuit in a common implementation [1].

Analogies to classical two-level logic 

We briefly recall classical two-level synthesis in order to contrast our circuits with this technique. Thus, let $F_2$ denote the field of two elements, and $b \in F_2$ also denote either a Boolean value or an integer of $\{0, 1\}$. In this section, let $\mathbf{b} = b_1 b_2 \cdots b_n \in (F_2)^n$ denote an n-bit string. Suppose $\varphi : (F_2)^n \to F_2$ describes an n-to-1 Boolean function we wish to realize with a circuit in the classical, irreversible AND-OR-NOT gate library. A textbook technique [6] is the two-level approach. Briefly, take $b_1, b_2, \ldots, b_n$ as variables, and let $\mathbf{c} = c_1 c_2 \cdots c_n \in (F_2)^n$ be a fixed bit string with $\varphi(\mathbf{c}) = 1$. Denote by $\delta_{\mathbf{c}}$ the indicator function of $\mathbf{c}$, i.e., $\delta_{\mathbf{c}} : (F_2)^n \to F_2$ has $\delta_{\mathbf{c}}(\mathbf{c}) = 1$ and $\delta_{\mathbf{c}}(\mathbf{b}) = 0$ for $\mathbf{b} \ne \mathbf{c}$. Then we have

$$\delta_{\mathbf{c}} = [\mathrm{NOT}^{c_1 \oplus 1}(b_1)]\ \mathrm{AND}\ [\mathrm{NOT}^{c_2 \oplus 1}(b_2)]\ \mathrm{AND} \cdots \mathrm{AND}\ [\mathrm{NOT}^{c_n \oplus 1}(b_n)], \tag{5}$$

where the AND gates are equivalently multiplication in $F_2$. Moreover, if $\{\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_\ell\} = \{\mathbf{b} \in (F_2)^n \; ; \; \varphi(\mathbf{b}) = 1\}$, then the expression

$$\varphi = \delta_{\mathbf{c}_1}\ \mathrm{OR}\ \delta_{\mathbf{c}_2}\ \mathrm{OR} \cdots \mathrm{OR}\ \delta_{\mathbf{c}_\ell} \tag{6}$$

provides an AND-OR-NOT circuit. For generic $\varphi$ with $\ell \approx 2^{n-1}$, note this classical two-level circuit requires $O(2^{n-1})$ gates.
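The two-level construction of Equations 5 and 6 can be sketched in a few lines. The helper names `indicator` and `two_level`, and the sample majority function, are illustrative assumptions rather than anything taken from the cited textbook technique.

```python
from itertools import product

def indicator(c):
    """Return delta_c as in Equation 5: an AND over NOT^(c_i XOR 1)(b_i)."""
    def delta(b):
        out = 1
        for ci, bi in zip(c, b):
            out &= bi ^ (ci ^ 1)   # invert b_i exactly when c_i = 0
        return out
    return delta

def two_level(phi, n):
    """Build phi as an OR of the indicators of its minterms (Equation 6)."""
    minterms = [c for c in product((0, 1), repeat=n) if phi(c)]
    deltas = [indicator(c) for c in minterms]
    def circuit(b):
        out = 0
        for d in deltas:
            out |= d(b)
        return out
    return circuit, len(minterms)

# Hypothetical example function: 3-bit majority, which has four minterms.
def majority(b):
    return int(sum(b) >= 2)

circuit, num_terms = two_level(majority, 3)
```

The number of indicator terms equals the number of inputs on which the function is 1, which is the quantity $\ell$ that drives the gate count above.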

Optimizing such two-level circuits is NP-hard [6], and the problem has been studied extensively since the late 1960s. Algorithms and tools for this problem, e.g. Espresso, are widely known, and some are used in commercial circuit design tools. More recently, two-level decompositions in the AND-XOR-NOT gate library have been introduced. This is still universal, as any $b_1, b_2 \in F_2$ have $(b_1\ \mathrm{OR}\ b_2) = b_1 \oplus b_2 \oplus (b_1\ \mathrm{AND}\ b_2)$. Publicly available tools for such ESOP-decomposition include EXORCISM-4 [11, 15]. We mention this transition $\mathrm{OR} \mapsto \mathrm{XOR}$ as it is loosely analogous to our change in strategy from Appendix A to Section 4. Other work on ROM-based quantum computation [16] has also made use of XOR based two-level synthesis.
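The identity behind the universality of the AND-XOR-NOT library can be checked exhaustively. The function name below is a hypothetical label for this check.

```python
def or_via_xor(b1, b2):
    """Express OR through XOR and AND: b1 OR b2 = b1 XOR b2 XOR (b1 AND b2)."""
    return b1 ^ b2 ^ (b1 & b2)

# Compare against the built-in OR on all four input pairs.
truth_table = [(b1, b2, or_via_xor(b1, b2)) for b1 in (0, 1) for b2 in (0, 1)]
```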

We extend these ideas to build a simple circuit for $U = \sum_{j=0}^{N-1} u_j |j\rangle\langle j|$ costing $O(n 2^n)$ elementary (or basic) gates. Recall standard notation uses $\Lambda_k(V)$ to denote a $k$-controlled $V$ gate for $V \in U(2^1)$ [1]. We extend this notation slightly, in view of this section and Appendix A.



Definition 2.1 In $n$ qubits, let $S \subset \{1, 2, \ldots, n-1\}$ and $V \in U(2^1)$. Then $\Lambda_S(V)$ denotes the particular instance of $\Lambda_{\#S}(V)$ controlled by lines $\{j \in S\}$ and acting on line $n$.

Definition 2.2 In $n$ qubits, let $S \subset \{1, 2, \ldots, n-1\}$. Then $\delta_S : (F_2)^n \to F_2$ is given by $\delta_S(j) = 1$ iff [$(j \ne n)$ and $(j \in S)$]. For $X = |1\rangle\langle 0| + |0\rangle\langle 1|$ a Pauli-X gate, we write $X^{\delta_S} = \otimes_{j=1}^{n-1} X^{\delta_S(j)}$. If $0 \le j \le N/2 - 1$, then $S(j) \subset \{1, 2, \ldots, n-1\}$ is the subset

$$S(j) = \{1 \le k \le n-1 \; ; \; c_k = 1 \text{ for } j = \mathbf{c} = c_1 c_2 \cdots c_{n-1}\}$$

Finally, for $S \subset \{1, 2, \ldots, n-1\}$, the number $k(S)$ is that integer $k$ such that $S = S(k)$.

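We now detail one construction of a circuit for $U = \sum_{j=0}^{N-1} u_j |j\rangle\langle j|$. Let $0 \le k \le N-1$, and $k \equiv 0 \bmod 2$. Let $V_k = u_k|0\rangle\langle 0| + u_{k+1}|1\rangle\langle 1|$, a one-qubit computation. Label

$$U_k = u_k|k\rangle\langle k| + u_{k+1}|k+1\rangle\langle k+1| + \sum_{j \ne k,\,k+1} |j\rangle\langle j| \tag{7}$$

Then we have the following expression.

$$U_k = [X^{\delta_{S(k)}} \otimes 1]\ \Lambda_{\{1,\ldots,n-1\}}(V_k)\ [X^{\delta_{S(k)}} \otimes 1] \tag{8}$$

Moreover, all such $U_k$ commute. Thus for any enumeration of subsets $S_1, \ldots, S_{N/2} \subset \{1, 2, \ldots, n-1\}$,

$$\begin{aligned} U = {} & [X^{\delta_{S_1}} \otimes 1]\ \Lambda_{\{1,2,\ldots,n-1\}}(V_{k(S_1)})\ [X^{\delta_{S_1}} \otimes 1] \circ {} \\ & [X^{\delta_{S_2}} \otimes 1]\ \Lambda_{\{1,2,\ldots,n-1\}}(V_{k(S_2)})\ [X^{\delta_{S_2}} \otimes 1] \circ \cdots \circ {} \\ & [X^{\delta_{S_{N/2}}} \otimes 1]\ \Lambda_{\{1,2,\ldots,n-1\}}(V_{k(S_{N/2})})\ [X^{\delta_{S_{N/2}}} \otimes 1] \end{aligned} \tag{9}$$

This directly produces a quantum circuit built out of subblocks such as the one illustrated in Figure 2.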

Before passing to the asymptotics, we note an optimization. A Gray code [12, §4.5.2] produces a sequence $S_1, S_2, S_3, \ldots, S_{N/2}$ in which consecutive subsets differ in exactly one element, i.e., $\#(S_k \,\triangle\, S_{k+1}) = 1$ for $1 \le k \le N/2 - 1$. Sample Gray codes are recalled with $n - 1 = 1, 2, 3$, where we write $k(S)$ for each subset:

0, 1
00, 01, 11, 10
000, 001, 011, 010, 110, 111, 101, 100        (10)

By using a Gray code in the choice of enumeration of the subsets for Equation 9, we obtain a massive cancellation of inverters leaving only $N/2$ such $X$ gates.
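For reference, the standard reflected binary Gray code can be generated with a shift and an XOR. This is a generic sketch, not the authors' code, and the helper names are assumptions of this example.

```python
def gray_code(bits):
    """Reflected binary Gray code on `bits` bits, returned as bit strings."""
    return [format(k ^ (k >> 1), "0{}b".format(bits)) for k in range(2 ** bits)]

def differ_in_one_bit(a, b):
    """True when two equal-length bit strings differ in exactly one position."""
    return sum(x != y for x, y in zip(a, b)) == 1

codes = gray_code(3)
```

Reading each string as the indicator of a subset of the top lines, consecutive subsets differ in a single line, which is what allows neighboring inverters to cancel.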

Figure 2 recalls the remaining facts justifying the $O(n 2^n)$ gate count for this synthesis algorithm. Namely, each of the $N/2$ computations $\Lambda_{n-1}(V)$ requires $O(n^2)$ basic gates absent an ancilla qubit or $48n - 164$ basic gates with an ancilla qubit present [1]. Summing produces asymptotics of $O(n 2^n)$ elementary or basic gates with the ancilla present and $O(n^2 2^n)$ gates without the ancilla present. In contrast, the circuits of Theorem 1.3 described in Section 4 require no ancilla and cost $O(2^n)$ gates.
3 Tensors and characters

The recursive processes of the two new synthesis algorithms for diagonal quantum computations in Section 4 and Appendix A both rely on well-known ideas from Lie theory [9]. Specifically, it is typical to study Lie groups and most especially commutative Lie groups using their character functions. For $G$ a Lie group, a character is a function $\chi : G \to \mathbb{C} - \{0\}$ with $\chi(gh) = \chi(g)\chi(h)$. The motivating example is the following group and character.

$$G = GL(n, \mathbb{C}) = \{M \text{ an } n \times n \text{ complex matrix} \; ; \; \exists M^{-1}\}, \qquad \chi = \det : GL(n, \mathbb{C}) \to \mathbb{C} - \{0\}$$

Note that for any character, $\log\chi(gh) = \log\chi(g) + \log\chi(h)$ and by continuity $\log\chi(g^\alpha) = \alpha\log\chi(g)$ for $g, h \in G$, $\alpha \in \mathbb{R}$. This will be useful in the sequel.

We seek an obstruction $\eta$ to writing $U_n \in A(n)$ as $U_{n-1} \otimes R_z(\alpha)$, written in terms of characters. First, let us classify which diagonal $U_n$ may be written in this way.

Proposition 3.1 (cf. [2, §2.2]) Let $U = \sum_{j=0}^{N-1} u_j |j\rangle\langle j|$. Then there exists $V = \sum_{j=0}^{N/2-1} v_j |j\rangle\langle j|$ in $A(n-1)$ and $W = w_0|0\rangle\langle 0| + w_1|1\rangle\langle 1|$ a one-qubit diagonal so that $U = V \otimes W$ if and only if

$$u_0 u_1^{-1} = u_2 u_3^{-1} = u_4 u_5^{-1} = \cdots = u_{N-2} u_{N-1}^{-1} \tag{11}$$

Proof: The check that such a tensor satisfies the chain of equalities is routine. For the opposite implication, let $U = \sum_{j=0}^{N-1} u_j |j\rangle\langle j|$. Then define $W = u_0|0\rangle\langle 0| + u_1|1\rangle\langle 1|$. Now $U$ being unitary demands $u_0 \ne 0$. Thus, choose in the expression for $V$ that $v_0 = 1$, $v_1 = u_2/u_0$, $v_2 = u_4/u_0$, $\ldots$, $v_{N/2-1} = u_{N-2}/u_0$. The chain equality then implies $U = V \otimes W$. □
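The test and the factorization in this proof translate directly into code. The sketch below stores diagonals as lists of complex entries; the function names and the sample data are hypothetical.

```python
import cmath

def is_tensor_on_last_line(u, tol=1e-9):
    """Check the chain of equalities (11): u[0]/u[1] = u[2]/u[3] = ..."""
    ratio = u[0] / u[1]
    return all(abs(u[j] / u[j + 1] - ratio) < tol for j in range(0, len(u), 2))

def factor_last_line(u):
    """Return (v, w) with U = V (x) W, following the proof of Proposition 3.1."""
    w = [u[0], u[1]]
    v = [u[j] / u[0] for j in range(0, len(u), 2)]
    return v, w

def kron_diag(v, w):
    """Diagonal of V (x) W from the diagonals of V and W."""
    return [vj * wk for vj in v for wk in w]

# Hypothetical input: a two-qubit diagonal tensored with a one-qubit diagonal.
v0 = [cmath.exp(1j * t) for t in (0.0, 0.4, 1.3, -2.0)]
w0 = [cmath.exp(-0.25j), cmath.exp(0.25j)]
u = kron_diag(v0, w0)
v, w = factor_last_line(u)
```

Multiplying a single entry of `u` by a nontrivial phase breaks the chain of equalities, so `is_tensor_on_last_line` then returns `False`.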

We now introduce the language for our obstruction $\eta$. Note that Corollary 3.3 motivates these technical terms and is crucial to the constructions of Section 4 and Appendix A.

Definition 3.2 Let $U = \sum_{j=0}^{N-1} u_j |j\rangle\langle j|$ define coordinates on $A(n)$. For $1 \le j \le N/2 - 1$, we define character functions $\chi_j : A(n) \to \mathbb{C} - \{0\}$ by $\chi_j(U) = u_{2j-2}\,u_{2j-1}^{-1}\,u_{2j}^{-1}\,u_{2j+1}$. For $U \in A(n)$, we define the vector valued function $\eta : A(n) \to \mathbb{R}^{N/2-1}$ by $\eta(U) = -i\,[\log\chi_1(U)\ \ \log\chi_2(U)\ \cdots\ \log\chi_{N/2-1}(U)]^t$. Here, the superscript denotes the transpose of the typeset row vector, so that we follow the usual convention of linear algebra that vector-valued functions output column vectors.

Corollary 3.3 The function $\eta : A(n) \to \mathbb{R}^{N/2-1}$ has the following properties.

• $[U = V \otimes W \text{ for } V \in A(n-1),\ W \in A(1)] \iff [\eta(U) = 0]$

• For $U_1, U_2 \in A(n)$, we have $\eta(U_1U_2) = \eta(U_1) + \eta(U_2)$.

• For $U \in A(n)$, $\alpha \in \mathbb{R}$, we have $\eta(U^\alpha) = \alpha\,\eta(U)$.

Hence, the function $\eta(-)$ is a quantitative obstruction to writing $U$ as a tensor on the last line. A heuristic for the algorithms of Section 4 and Appendix A would then be the following.

1. Define a large enough set of parameter dependent circuit blocks in $A(n)$ so as to control all $N/2 - 1$ degrees of freedom of $\eta$. Note this number of degrees of freedom coincides with the number of nonempty subsets of the top lines $\{1, 2, \ldots, n-1\}$.

2. Use the previous construction and the properties of $\eta$ to append circuit blocks to $U$ so that $\eta = 0$. Then the composition $\tilde{U} = V \otimes W$, with $W$ some $R_z(\alpha)$ gate up to global phase.

3. Recurse on $V$.

In terms of this heuristic, the circuit blocks of Appendix A are the usual conditioned gates $\Lambda_k[R_z(\alpha)]$ [1], while Section 4 requires a variant XOR-controlled rotation. We denote this $\oplus_k[R_z(\alpha)]$, in analogy to the $\Lambda$ of $\Lambda_k[V]$ being an enlarged version of the propositional logic symbol $\wedge$ for AND.



4 Synthesis using $\oplus_k[R_z(\alpha)]$

This section describes our main synthesis algorithm. Certain proofs are omitted due to their similarity to results of Appendix A. This appendix may be read first independently in order to motivate the constructions in this section.



Circuit blocks for $\oplus_k[R_z(\alpha)]$

We begin by making precise the notion of a $k$-fold XOR-controlled one-qubit computation $V \in U(2^1)$. Several circuit blocks holding $2k + 1$ elementary gates are associated with this for $V = R_z(\alpha)$. Thus we first describe the $(k+1)$-qubit computation, then highlight a circuit optimized for cancellation in our application, and finally describe possible variant circuit blocks.

Definition 4.1 Let $k \ge 1$, $V \in U(2^1)$ a one-qubit quantum computation, and for $b_1, b_2, \ldots, b_{k+1} \in F_2$ let the bit-string $b_1 b_2 \cdots b_{k+1}$ also denote the element of $\mathbb{Z}$ with this binary representation. Then the XOR-controlled V-computation controlled on lines $1, 2, \ldots, k$ and acting on line $k+1$ is that $\oplus_k(V) \in U(2^{k+1})$ which extends linearly from

$$[\oplus_k(V)]\,|b_1 b_2 \cdots b_{k+1}\rangle = \begin{cases} |b_1 b_2 \cdots b_k\rangle \otimes V|b_{k+1}\rangle, & \text{if } b_1 \oplus b_2 \oplus \cdots \oplus b_k = 0 \in F_2 \\ |b_1 b_2 \cdots b_k\rangle \otimes V^*|b_{k+1}\rangle, & \text{if } b_1 \oplus b_2 \oplus \cdots \oplus b_k = 1 \in F_2 \end{cases} \tag{12}$$

Here, $V^* \in U(2^1)$ is the inverse or adjoint operator to $V$ and the symbol $\oplus$ denotes the exclusive-OR operation which is also addition in $F_2$. We take the convention that $\oplus_0(V)|b_1 b_2 \cdots b_n\rangle = |b_1 b_2 \cdots b_{n-1}\rangle \otimes V|b_n\rangle$. In $n$ qubits, should $S \subset \{1, 2, \ldots, n-1\}$ be a possibly empty subset, we write $\oplus_S(V)$ for the instance of $\oplus_{\#S}(V)$ conditioned on lines $\{j \in S\}$ and acting on line $n$.
In the application, we will use the circuit diagram for $\oplus_k[R_z(\alpha)]$ which follows from the following equation. Let $S \subset \{1, 2, \ldots, n-1\}$, say nonempty, label $S = \{s_1, s_2, \ldots, s_k\}$, and finally let $1 \in U(N/2)$ denote an $(n-1)$-qubit identity computation. Recalling the controlled-not notation $C^k_j$ from the Introduction one has

$$\oplus_S[R_z(\alpha)] = C^n_{s_1} C^n_{s_2} \cdots C^n_{s_k}\ [1 \otimes R_z(\alpha)]\ C^n_{s_k} \cdots C^n_{s_2} C^n_{s_1} \tag{13}$$

All controlled-not gates to either side of the $1 \otimes R_z(\alpha)$ term commute. The right hand side of Figure 3 illustrates the corresponding circuits. These circuits require $2k + 1$ elementary gates and are the implementation of $\oplus_k[R_z(\alpha)]$ used in our final circuit diagrams. For completeness, we briefly note possible variant circuit blocks of the same size.
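Equation 13 can be checked numerically by tracking each computational basis state through the controlled-nots. The sketch below compares that with the case definition of Equation 12 for $V = R_z(\alpha)$, using $R_z(\alpha)^* = R_z(-\alpha)$. It assumes line 1 is the most significant bit of the basis label, and all function names are illustrative.

```python
import cmath

def rz_phase(alpha, bit):
    """Diagonal entry of R_z(alpha) on |bit>."""
    return cmath.exp((0.5j if bit else -0.5j) * alpha)

def bits_of(j, n):
    """Bits b_1 ... b_n of j, with line 1 the most significant bit."""
    return [(j >> (n - 1 - i)) & 1 for i in range(n)]

def by_definition(S, alpha, n):
    """Diagonal of the XOR-controlled R_z(alpha) following Equation 12."""
    diag = []
    for j in range(2 ** n):
        b = bits_of(j, n)
        parity = sum(b[s - 1] for s in S) % 2
        diag.append(rz_phase(alpha if parity == 0 else -alpha, b[n - 1]))
    return diag

def by_circuit(S, alpha, n):
    """Diagonal obtained from the CNOT / R_z / CNOT sandwich of Equation 13."""
    diag = []
    for j in range(2 ** n):
        b = bits_of(j, n)
        for s in S:                       # controlled-nots from line s onto line n
            b[n - 1] ^= b[s - 1]
        phase = rz_phase(alpha, b[n - 1])
        for s in reversed(S):             # the same controlled-nots undo themselves
            b[n - 1] ^= b[s - 1]
        assert b == bits_of(j, n)         # the basis state is restored
        diag.append(phase)
    return diag
```

Because every basis state returns to itself after the second run of controlled-nots, the block is diagonal and is fully described by the list of phases returned here.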

Let $S \subset \{1, \ldots, n-1\}$ and $S \ne \emptyset$. Suppose $S = \{s_1, \ldots, s_k\}$ with $s_1 < s_2 < \cdots < s_k$. Then another quantum circuit for $\oplus_S[R_z(\alpha)]$ arises from

This is illustrated to the left in Figure 3.

Finally, although the controlled-not gates in the second diagram corresponding to the alternate Equation certainly do not commute, one may reorder the circuit in a certain sense. Let $\sigma$ be a permutation of $\{1, \ldots, k\}$, retaining $S = \{s_1 < s_2 < \cdots < s_k\}$.

ffi \R (all - d ^ ' ■ » C -« [1 ^ (a)] (15) 

See the left hand side of Figure 3. 

Computation of $\eta(\oplus_S[R_z(\alpha)])$

We find it more convenient to use mathematical notation for vectors such as values of $\eta$ rather than the bra-ket notation. We briefly recall the appropriate conventions, treated in more detail in Appendix A.

Definition 4.2 For $1 \le j \le N/2 - 1$, let $e_j$ denote the column vector in $\mathbb{R}^{N/2-1}$ with a single entry of 1 in the $j$-th row and all other entries 0. The vectors $v_j = e_j - e_{j+1}$ if $1 \le j \le N/2 - 2$, while $v_0 = -e_1$ and $v_{N/2-1} = e_{N/2-1}$.

The vectors $\{v_j \; ; \; 1 \le j \le N/2 - 1\}$ form a basis for $\mathbb{R}^{2^{n-1}-1}$. We need one more definition before computing $\eta(\oplus_S[R_z(\alpha)])$.

Definition 4.3 Let $S = \{s_1, s_2, \ldots, s_k\} \subset \{1, 2, \ldots, n-1\}$ be nonempty. In $n$ qubits with $N = 2^n$, let $1 \le j \le N/2 - 1$ with binary representation $j = b_1 b_2 \cdots b_{n-1}$ for $b_1, b_2, \ldots, b_{n-1} \in F_2$. Then we say the integer $j$ is XOR-S-conditioned iff $b_{s_1} \oplus b_{s_2} \oplus \cdots \oplus b_{s_k} = 1$. We further define the set

$$\mathcal{F}(S) = \{1 \le j \le N/2 - 1 \; ; \; j \text{ is XOR-}S\text{-conditioned}\} \tag{16}$$

By a flip state of $S$, we mean any $j \in \mathcal{F}(S)$, i.e., S-flip is an abbreviation of XOR-S-conditioned.

Example 4.4 Consider the special case of $n = 4$ qubits. The flip states of each nonempty subset of $\{1,2,3\}$ of the top three lines are given in the table below, in binary.

subset        flip states
{1}           100, 101, 110, 111
{1,2}         010, 011, 100, 101
{1,3}         001, 011, 100, 110
{1,2,3}       001, 010, 100, 111
{2}           010, 011, 110, 111
{2,3}         001, 010, 101, 110
{3}           001, 011, 101, 111

Note that for any $S \ne \emptyset$, exactly half of the eight integers $0, 1, \ldots, 7$ are elements of $\mathcal{F}(S)$. ◊
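The table can be regenerated from Definition 4.3. The sketch below assumes, as in the table, that $b_1$ is the most significant bit of $j$; `flip_states` is a hypothetical helper name.

```python
def flip_states(S, n):
    """Integers j in 1 .. 2^(n-1) - 1 whose bits b_1 ... b_(n-1) have odd parity over S."""
    out = []
    for j in range(1, 2 ** (n - 1)):
        parity = 0
        for s in S:
            parity ^= (j >> (n - 1 - s)) & 1   # bit b_s of j
        if parity:
            out.append(j)
    return out

# Reproduce two rows of the n = 4 table as 3-bit strings.
row_13 = [format(j, "03b") for j in flip_states([1, 3], 4)]
row_123 = [format(j, "03b") for j in flip_states([1, 2, 3], 4)]
```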

Proposition 4.5 Let $\mathcal{F}(S)$ be the set of flip states of any nonempty $S \subset \{1, 2, \ldots, n-1\}$. Then

$$\eta(\oplus_S[R_z(\alpha)]) = -2\alpha \sum_{j \in \mathcal{F}(S)} v_j \tag{17}$$

Also, for $S = \emptyset$, $\eta[1 \otimes R_z(\alpha)] = 0$.

The proof is similar to that of Proposition A.3. However, $\oplus_S[R_z(\alpha)]$ never leaves any computational basis state fixed, which accounts for the factor of two.

Example 4.6 Consider $n = 4$ qubits for the subset $S = \{1,3\}$ and $\alpha$ arbitrary. For convenience, label $\phi = -\alpha/2$, so that $R_z(\alpha) = e^{i\phi}|0\rangle\langle 0| + e^{-i\phi}|1\rangle\langle 1|$. We leave it to the reader to check that $V = \oplus_S[R_z(\alpha)]$ is diagonal and merely describe the multiples on each computational basis state.

state     mult         state     mult         state     mult         state     mult
|0000⟩    e^{iφ}       |0100⟩    e^{iφ}       |1000⟩    e^{-iφ}      |1100⟩    e^{-iφ}
|0001⟩    e^{-iφ}      |0101⟩    e^{-iφ}      |1001⟩    e^{iφ}       |1101⟩    e^{iφ}
|0010⟩    e^{-iφ}      |0110⟩    e^{-iφ}      |1010⟩    e^{iφ}       |1110⟩    e^{iφ}
|0011⟩    e^{iφ}       |0111⟩    e^{iφ}       |1011⟩    e^{-iφ}      |1111⟩    e^{-iφ}

Thus, $\chi_1(V) = e^{4i\phi}$, $\chi_2(V) = e^{-4i\phi}$, $\chi_3(V) = e^{4i\phi}$, $\chi_4(V) = 1$, $\chi_5(V) = e^{-4i\phi}$, $\chi_6(V) = e^{4i\phi}$, and $\chi_7(V) = e^{-4i\phi}$. Thus we have computed $\eta(\oplus_{\{1,3\}}[R_z(\alpha)]) = 4\phi\,[1\ \ {-1}\ \ 1\ \ 0\ \ {-1}\ \ 1\ \ {-1}]^t$.

On the other hand, flip states for $\{1,3\}$ are given in binary by $j = 001, 011, 100$, and $110$. So $\mathcal{F}(S) = \{1, 3, 4, 6\}$ and

$$\sum_{j \in \mathcal{F}(S)} v_j = (e_1 - e_2) + (e_3 - e_4) + (e_4 - e_5) + (e_6 - e_7) = [1\ \ {-1}\ \ 1\ \ 0\ \ {-1}\ \ 1\ \ {-1}]^t.$$

Since $4\phi = -2\alpha$, this agrees with Proposition 4.5. This concludes the example. ◊
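The computation in this example can be repeated numerically. The sketch below evaluates $\eta$ through the principal argument of each character $\chi_j$, so it assumes $|2\alpha| < \pi$ to avoid branch issues; the function names and the choice $\alpha = 0.3$ are illustrative only.

```python
import cmath

def xor_rz_diag(S, alpha, n):
    """Diagonal of the XOR-controlled R_z(alpha); line 1 is the most significant bit."""
    diag = []
    for j in range(2 ** n):
        parity = sum((j >> (n - s)) & 1 for s in S) % 2
        flipped = (j & 1) ^ parity          # last-line bit, swapped when parity is odd
        diag.append(cmath.exp((0.5j if flipped else -0.5j) * alpha))
    return diag

def eta(u):
    """Arguments of chi_j(U) = u[2j-2] u[2j+1] / (u[2j-1] u[2j]) for j = 1 .. N/2 - 1."""
    return [cmath.phase(u[2 * j - 2] * u[2 * j + 1] / (u[2 * j - 1] * u[2 * j]))
            for j in range(1, len(u) // 2)]

def predicted(S, alpha, n):
    """Right-hand side of Equation 17: -2 alpha times the sum of v_j over flip states."""
    dim = 2 ** (n - 1) - 1
    vec = [0.0] * dim
    for j in range(1, dim + 1):
        if sum((j >> (n - 1 - s)) & 1 for s in S) % 2:   # j is a flip state of S
            vec[j - 1] += -2 * alpha
            if j < dim:
                vec[j] -= -2 * alpha
    return vec

alpha = 0.3
lhs = eta(xor_rz_diag([1, 3], alpha, 4))
rhs = predicted([1, 3], alpha, 4)
```

Both `lhs` and `rhs` come out proportional to the vector $[1, -1, 1, 0, -1, 1, -1]$ with factor $-2\alpha$.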
Figure 3: Shown at center is a symbol due to the authors for denoting XOR control. At right are circuits for $\oplus_S[R_z(\alpha)]$ per Equation 13, as used in the circuits for diagonal computations. Here, $n = 4$ qubits and $S = \{1,3\} \subset \{1,2,3\}$, so this is an instance of $\oplus_2[R_z(\alpha)]$. At left are possible variant circuits per Equations 14 and 15, where $\sigma$ is an identity permutation in one case and the flip permutation of two elements in the other.



$\oplus_k[R_z(\alpha)]$-block synthesis algorithm

The $-0.5$ radians in the definition of the following matrix cancels the $-2$ coefficient in Equation 17, so that the resulting matrix has all entries in $\mathbb{Z}$. It is similar to Definition A.5.

Definition 4.7 The matrix $\eta^\oplus$ is the $(N/2 - 1) \times (N/2 - 1)$ real matrix defined as follows. Order nonempty subsets $S_1, S_2, \ldots, S_{N/2-1}$ in Gray order, omitting the empty set. Then for $1 \le j \le N/2 - 1$, the $j$-th column of $\eta^\oplus$ is $\eta(\oplus_{S_j}[R_z(-0.5 \text{ radians})])$.

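Example 4.8 Computing the four-qubit case of $\eta^\oplus$ is most quickly accomplished using the table of Example 4.4 and Proposition 4.5. The Gray order of nonempty subsets of $\{1,2,3\}$ is $\{3\}, \{2,3\}, \{2\}, \{1,2\}, \{1,2,3\}, \{1,3\}, \{1\}$. Thus the Definition in this case states

$$\eta^\oplus = \begin{pmatrix} 1 & 1 & 0 & 0 & 1 & 1 & 0 \\ -1 & 0 & 1 & 1 & 0 & -1 & 0 \\ 1 & -1 & 0 & 0 & -1 & 1 & 0 \\ -1 & 0 & -1 & 0 & 1 & 0 & 1 \\ 1 & 1 & 0 & 0 & -1 & -1 & 0 \\ -1 & 0 & 1 & -1 & 0 & 1 & 0 \\ 1 & -1 & 0 & 0 & 1 & -1 & 0 \end{pmatrix} \tag{18}$$

The sixth column, which belongs to $\{1,3\}$, recalls Example 4.6. ◊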
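The matrix of Definition 4.7 can be assembled from the flip states alone, since by Proposition 4.5 the column for a subset $S$ is the sum of the vectors $v_j$ over $j \in \mathcal{F}(S)$. The sketch below assumes the reflected Gray order with line 1 as the most significant bit, which reproduces the subset order quoted in Example 4.8; the helper names are hypothetical.

```python
def gray_subsets(n):
    """Nonempty subsets of the top lines {1..n-1}, in reflected Gray order."""
    m = n - 1
    subsets = []
    for k in range(1, 2 ** m):
        g = k ^ (k >> 1)
        subsets.append([line for line in range(1, m + 1) if (g >> (m - line)) & 1])
    return subsets

def flip_states(S, n):
    """Integers j in 1 .. 2^(n-1) - 1 with odd parity of the bits b_s, s in S."""
    return [j for j in range(1, 2 ** (n - 1))
            if sum((j >> (n - 1 - s)) & 1 for s in S) % 2]

def eta_plus(n):
    """Rows of the integer matrix whose columns are sums of v_j over flip states."""
    dim = 2 ** (n - 1) - 1
    cols = []
    for S in gray_subsets(n):
        col = [0] * dim
        for j in flip_states(S, n):
            col[j - 1] += 1          # the e_j part of v_j
            if j < dim:
                col[j] -= 1          # the -e_(j+1) part, absent for the last v_j
        cols.append(col)
    return [[cols[c][r] for c in range(dim)] for r in range(dim)]

matrix4 = eta_plus(4)
```

For three qubits the same routine returns a $3 \times 3$ matrix with columns ordered by $\{2\}, \{1,2\}, \{1\}$.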

The matrix $\eta^\oplus$ has the following application. Note the right hand side is matrix multiplication with the column vector $\vec{\alpha}$.

Lemma 4.9 Fix $n$ qubits, with $N = 2^n$. Let $\vec{\alpha} = [\alpha_1 \cdots \alpha_{N/2-1}]^t$ be a vector of angles, $0 \le \alpha_j < 2\pi$, $1 \le j \le N/2 - 1$. Then for $S_1, S_2, \ldots, S_{N/2-1}$ the Gray ordering of the nonempty subsets of the set of top lines $\{1, \ldots, n-1\}$,

$$\eta\big(\oplus_{S_1}[R_z(\alpha_1)] \cdots \oplus_{S_{N/2-1}}[R_z(\alpha_{N/2-1})]\big) = -2\,\eta^\oplus\,\vec{\alpha} \tag{19}$$

The proof is quite similar to Lemma A.6. It uses Proposition 4.5 and properties of $\eta(-)$ following from each component being a character.

We now state the synthesis algorithm. It is critical in the fol- 
lowing that T|® be invertible. This result will be proven in the 
next subsection. 



XOR-Controlled Rotation Synthesis Algorithm Let $U = \sum_{j=0}^{N-1} u_j |j\rangle\langle j|$, 
for which we wish to synthesize a circuit diagram using $\oplus_k[R_z(\alpha)]$ blocks. 
Label $S_1, S_2, S_3, \ldots, S_{N/2-1}$ the nonempty subsets of the top lines 
$\{1, \ldots, n-1\}$ in the Grey order. 

1. Compute $\psi = \eta(U)$. 

2. Compute the inverse matrix $(\eta^{\oplus})^{-1}$. 

3. Compute $\alpha = (-1/2)(\eta^{\oplus})^{-1}\psi$, treating $\psi$ as a column vector. 
Label $\alpha = [\alpha_1 \cdots \alpha_{N/2-1}]^t$. 

4. Compute the diagonal quantum computation 
$\tilde{U} = \oplus_{S_1}[R_z(-\alpha_1)] \cdots \oplus_{S_{N/2-1}}[R_z(-\alpha_{N/2-1})]\, U$. As 
is verified below, $\tilde{U}$ is a tensor. 

5. Use the argument of Proposition 3.1 to compute $\tilde{U} = V \otimes W$ 
for $V \in A(n-1)$ and $W = e^{i\phi} R_z(\alpha_0)$ for some angle $\alpha_0$. 

6. Given prior computations, the following expression holds: 

\[
U = e^{i\phi}\,(V \otimes 1)\; \oplus_{\emptyset}[R_z(\alpha_0)]\; \oplus_{S_1}[R_z(\alpha_1)] \cdots \oplus_{S_{N/2-1}}[R_z(\alpha_{N/2-1})] \tag{20}
\]

Here, 1 denotes the trivial computation of $U(2^1)$. Also, 
$\oplus_{\emptyset}[R_z(\alpha_0)]$ means $1 \otimes R_z(\alpha_0)$ for $1 \in U(N/2)$. 

7. Decompose each $\oplus_k[R_z(\alpha)]$ into elementary gates using the 
circuit diagrams at the right of Figure 3. 

8. Using the Grey order and $C^n_j C^n_k = C^n_k C^n_j$, cancel all but one 
controlled-not between consecutive $R_z(\alpha)$ gates in the resulting 
diagram. 

9. The algorithm terminates by recursively producing a circuit 
diagram for $V \in A(n-1)$. 
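
Steps 1 through 4 can be sketched with exact rational arithmetic. The code below is an illustration under stated assumptions, not the authors' implementation: phases are stored in units of $\pi$ (so `theta[j]` stands for $u_j = e^{i\pi\theta_j}$), line 1 is the most significant bit, the Grey order is the reflected Gray code, and the linear system is built directly from $\eta$ of unit-angle blocks, which by Lemma 4.9 equals $-2\eta^{\oplus}$. All function names are hypothetical, and the input diagonal is chosen only for illustration.

```python
from fractions import Fraction as F


def eta(theta):
    """Obstruction of diag(exp(i*pi*theta[j])), in units of pi: entry j is
    theta[2j-2] - theta[2j-1] - theta[2j] + theta[2j+1], j = 1..N/2-1."""
    return [theta[2*j - 2] - theta[2*j - 1] - theta[2*j] + theta[2*j + 1]
            for j in range(1, len(theta) // 2)]


def xor_block(mask, alpha, n):
    """Phases of an XOR-controlled R_z(alpha) on line n: the pair
    (-alpha/2, +alpha/2) is swapped whenever j is a flip state of mask."""
    phases = []
    for j in range(2 ** (n - 1)):
        sign = -1 if bin(j & mask).count("1") % 2 else 1
        phases += [-sign * alpha / 2, sign * alpha / 2]
    return phases


def solve(A, b):
    """Gauss-Jordan elimination over Fractions; A is assumed invertible."""
    m = len(A)
    M = [[F(x) for x in row] + [F(y)] for row, y in zip(A, b)]
    for c in range(m):
        p = next(r for r in range(c, m) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        M[c] = [x / M[c][c] for x in M[c]]
        for r in range(m):
            if r != c:
                M[r] = [x - M[r][c] * y for x, y in zip(M[r], M[c])]
    return [row[-1] for row in M]


def reduce_to_tensor(theta, n):
    """Return (alpha, phases of U-tilde) following steps 1 to 4."""
    masks = [i ^ (i >> 1) for i in range(1, 2 ** (n - 1))]   # Gray order
    cols = [eta(xor_block(m, F(1), n)) for m in masks]       # -2 * eta_xor
    A = [[col[j] for col in cols] for j in range(len(masks))]
    alpha = solve(A, eta(theta))                             # steps 1 to 3
    tilde = list(theta)
    for m, a in zip(masks, alpha):                           # step 4
        tilde = [t + p for t, p in zip(tilde, xor_block(m, -a, n))]
    return alpha, tilde


theta = [F(k, 12) for k in (4, 2, 9, 7, 3, 8, 11, 10)]
alpha, tilde = reduce_to_tensor(theta, 3)
```

With this input, `alpha` comes out as 1/8, -1/8, -1/6 (that is, $3\pi/24$, $-3\pi/24$, $-4\pi/24$), and `eta(tilde)` vanishes, so the corrected diagonal factors as $V \otimes W$ and step 5 can proceed.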

Example 4.10 Consider the following 3-qubit computation: 

\[
\begin{aligned}
U ={}& e^{4\pi i/12}|0\rangle\langle 0| + e^{2\pi i/12}|1\rangle\langle 1| + e^{9\pi i/12}|2\rangle\langle 2| + e^{7\pi i/12}|3\rangle\langle 3| \\
&+ e^{3\pi i/12}|4\rangle\langle 4| + e^{8\pi i/12}|5\rangle\langle 5| + e^{11\pi i/12}|6\rangle\langle 6| + e^{10\pi i/12}|7\rangle\langle 7|
\end{aligned} \tag{21}
\]

We apply the synthesis algorithm above to U. 

We begin by computing the 3-qubit case of $\eta^{\oplus}$. The Grey 
order is {2}, {1,2}, and {1}. 

\[
\eta^{\oplus} = \begin{pmatrix} 1 & 1 & 0 \\ -1 & 0 & 1 \\ 1 & -1 & 0 \end{pmatrix} \tag{22}
\]

The inverse matrix appears in the algorithm and may be reused 
for multiple diagonal computations. 

\[
(\eta^{\oplus})^{-1} = \frac{1}{2} \begin{pmatrix} 1 & 0 & 1 \\ 1 & 0 & -1 \\ 1 & 2 & 1 \end{pmatrix} \tag{23}
\]

Now $\psi = \eta(U) = -i[\log\chi_1(U)\ \ \log\chi_2(U)\ \ \log\chi_3(U)]^t = [0\ \ 7\pi/12\ \ -6\pi/12]^t$. 
Thus computing the parameters for the $\oplus_S[R_z(\alpha)]$ blocks, 

\[
\alpha = (-1/2)(\eta^{\oplus})^{-1}\psi = [3\pi/24\ \ -3\pi/24\ \ -4\pi/24]^t \tag{24}
\]

It should be the case that the computation $\tilde{U}$ given by 

\[
\tilde{U} = \oplus_{\{2\}}[R_z(-3\pi/24)]\; \oplus_{\{1,2\}}[R_z(3\pi/24)]\; \oplus_{\{1\}}[R_z(4\pi/24)]\; U \tag{25}
\]

has $\tilde{U} = V \otimes W$ for V a two-qubit diagonal and W a one-qubit 
diagonal. We verify this by computing matrix coefficients for $\tilde{U}$. 

In the following computation, for given $R \in A$ we abbreviate 
$R = \sum_{j=0}^{N-1} r_j |j\rangle\langle j|$ as $R = \mathrm{diag}(r_0, r_1, \ldots, r_{N-1})$ in order 
to save space. The first step in computing $\tilde{U}$ is to compute 
$\oplus_{\{1\}}[R_z(4\pi/24)]$. Begin by noting that 

\[
1 \otimes 1 \otimes R_z(4\pi/24) = \mathrm{diag}(e^{-4\pi i/48}, e^{4\pi i/48}, e^{-4\pi i/48}, e^{4\pi i/48}, e^{-4\pi i/48}, e^{4\pi i/48}, e^{-4\pi i/48}, e^{4\pi i/48}) \tag{26}
\]

Associating the entries with $|000\rangle$, $|001\rangle$, etc., we reverse those 
pairs $|b_1 b_2 b_3\rangle$ for which the binary integer $b_1 b_2$ is a flip state of {1}. 

\[
\oplus_{\{1\}}[R_z(4\pi/24)] = \mathrm{diag}(e^{-4\pi i/48}, e^{4\pi i/48}, e^{-4\pi i/48}, e^{4\pi i/48}, e^{4\pi i/48}, e^{-4\pi i/48}, e^{4\pi i/48}, e^{-4\pi i/48}) \tag{27}
\]

We may similarly construct $\oplus_{\{1,2\}}[R_z(3\pi/24)]$. 

\[
\oplus_{\{1,2\}}[R_z(3\pi/24)] = \mathrm{diag}(e^{-3\pi i/48}, e^{3\pi i/48}, e^{3\pi i/48}, e^{-3\pi i/48}, e^{3\pi i/48}, e^{-3\pi i/48}, e^{-3\pi i/48}, e^{3\pi i/48}) \tag{28}
\]

Finally, the flip states of {2} are j = 1, 3. Thus 

\[
\oplus_{\{2\}}[R_z(-3\pi/24)] = \mathrm{diag}(e^{3\pi i/48}, e^{-3\pi i/48}, e^{-3\pi i/48}, e^{3\pi i/48}, e^{3\pi i/48}, e^{-3\pi i/48}, e^{-3\pi i/48}, e^{3\pi i/48}) \tag{29}
\]

Collecting all terms, we arrive at 

\[
\begin{aligned}
\tilde{U} ={}& \mathrm{diag}(e^{-4\pi i/48}, e^{4\pi i/48}, e^{-4\pi i/48}, e^{4\pi i/48}, e^{4\pi i/48}, e^{-4\pi i/48}, e^{4\pi i/48}, e^{-4\pi i/48}) \\
&\cdot \mathrm{diag}(e^{-3\pi i/48}, e^{3\pi i/48}, e^{3\pi i/48}, e^{-3\pi i/48}, e^{3\pi i/48}, e^{-3\pi i/48}, e^{-3\pi i/48}, e^{3\pi i/48}) \\
&\cdot \mathrm{diag}(e^{3\pi i/48}, e^{-3\pi i/48}, e^{-3\pi i/48}, e^{3\pi i/48}, e^{3\pi i/48}, e^{-3\pi i/48}, e^{-3\pi i/48}, e^{3\pi i/48}) \\
&\cdot \mathrm{diag}(e^{16\pi i/48}, e^{8\pi i/48}, e^{36\pi i/48}, e^{28\pi i/48}, e^{12\pi i/48}, e^{32\pi i/48}, e^{44\pi i/48}, e^{40\pi i/48}) \\
={}& \mathrm{diag}(e^{12\pi i/48}, e^{12\pi i/48}, e^{32\pi i/48}, e^{32\pi i/48}, e^{22\pi i/48}, e^{22\pi i/48}, e^{42\pi i/48}, e^{42\pi i/48})
\end{aligned} \tag{30}
\]

Thus $\tilde{U} = \mathrm{diag}(e^{12\pi i/48}, e^{32\pi i/48}, e^{22\pi i/48}, e^{42\pi i/48}) \otimes \mathrm{diag}(1, 1)$. 
The odd happenstance that the latter tensor factor is an identity 
saves one gate. 

Next, write out circuit diagrams for each $\oplus_S[R_z(\alpha)]$ per the 
right hand side of Figure 3. Since we chose the Grey order 
{2}, {1,2}, {1}, cancelling controlled-not gates produces the 
leftmost 8 elementary gates of Figure 1. Finally, call the 
algorithm recursively on V. The two-qubit case coincides with other 
work [2, §2.2]. ◇

Proof of Correctness 

We briefly verify that $\tilde{U} = V \otimes W$. First use Proposition 4.5 for 

\[
\eta\big( \oplus_{S_1}[R_z(-\alpha_1)] \cdots \oplus_{S_{N/2-1}}[R_z(-\alpha_{N/2-1})] \big) = 2\,\eta^{\oplus}\alpha \tag{31}
\]

Now by definition $\alpha = (-1/2)(\eta^{\oplus})^{-1}\psi$, so that $2\eta^{\oplus}\alpha = -\psi$. 

\[
\eta\big( \oplus_{S_1}[R_z(-\alpha_1)] \cdots \oplus_{S_{N/2-1}}[R_z(-\alpha_{N/2-1})] \big) = -\psi \tag{32}
\]

Then the property $\eta(U_1 U_2) = \eta(U_1) + \eta(U_2)$ demands 

\[
\eta\big( \oplus_{S_1}[R_z(-\alpha_1)] \cdots \oplus_{S_{N/2-1}}[R_z(-\alpha_{N/2-1})]\, U \big) = -\psi + \psi = 0 \tag{33}
\]

So by the restatement of Proposition 3.1, we have $\tilde{U} = V \otimes W$. 
There is one remaining unjustified (subtle) statement to check. 
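
The two facts used in this verification, multiplicativity of each character $\chi_j$ (equivalently additivity of $\eta$) and the vanishing of $\eta$ on the corrected computation, can also be checked in floating point on the 3-qubit example. The sketch below is illustrative only. It assumes $\chi_j(U) = u_{2j-2}\,u_{2j-1}^{-1}\,u_{2j}^{-1}\,u_{2j+1}$, takes the input diagonal with phases $4, 2, 9, 7, 3, 8, 11, 10$ in units of $\pi/12$, and uses hypothetical helper names.

```python
import cmath
import math


def chi(u):
    """Characters chi_j(U) = u[2j-2] * u[2j+1] / (u[2j-1] * u[2j])."""
    return [u[2*j - 2] * u[2*j + 1] / (u[2*j - 1] * u[2*j])
            for j in range(1, len(u) // 2)]


def xor_rz(flips, alpha, n):
    """Diagonal of an XOR-controlled R_z(alpha) on the last of n lines;
    flips is the set of flip states j, where the two entries are swapped."""
    diag = []
    for j in range(2 ** (n - 1)):
        a = -alpha if j in flips else alpha
        diag += [cmath.exp(-0.5j * a), cmath.exp(0.5j * a)]
    return diag


def times(a, b):
    """Entrywise product of two diagonals."""
    return [x * y for x, y in zip(a, b)]


pi = math.pi
U = [cmath.exp(1j * pi * k / 12) for k in (4, 2, 9, 7, 3, 8, 11, 10)]
blocks = [xor_rz({2, 3}, 4 * pi / 24, 3),    # subset {1}
          xor_rz({1, 2}, 3 * pi / 24, 3),    # subset {1,2}
          xor_rz({1, 3}, -3 * pi / 24, 3)]   # subset {2}

correction = blocks[0]
for b in blocks[1:]:
    correction = times(correction, b)
U_tilde = times(correction, U)

# Multiplicativity of each character is the additivity of eta.
multiplicative = all(abs(c - a * b) < 1e-12
                     for c, a, b in zip(chi(U_tilde), chi(correction), chi(U)))
# eta(U_tilde) = 0 means every character equals 1.
is_tensor = all(abs(c - 1) < 1e-12 for c in chi(U_tilde))
```

Both flags evaluate to true, and consecutive pairs of entries of `U_tilde` agree, which is the statement that the last tensor factor is the identity.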



Proposition 4.11 $\eta^{\oplus}$ is an invertible $(N/2 - 1) \times (N/2 - 1)$ real 
matrix for $n > 1$. 

Sketch: It is equivalent to consider the question for an alternate 
basis of $\mathbb{R}^{N/2-1}$. Thus, choose instead the vectors 
$\{v_j ; 1 \le j \le N/2 - 1\}$ of Definition 4.2. In this alternate basis, the similar 
matrix M corresponding to $\eta^{\oplus}$ has an entry of 1 for the $v_j$ component 
whenever j is a flip state for the k-th set in Grey order. 

Fix a nonempty subset S of $\{1, 2, \ldots, n-1\}$, thus fixing a 
column of $\eta^{\oplus}$. We first claim there are precisely $2^{n-2}$ flip states 
for S. To see this, observe that the equation $\oplus_{k \in S} b_k = 1$ satisfied 
by S-flip states defines an affine linear $\mathbb{F}_2$ subspace of the 
finite-dimensional vector space $(\mathbb{F}_2)^{n-1}$. Then this number of 
elements corresponds to the dimension count, since any $\ell$ dimensional 
vector space with $\mathbb{F}_2$-scalars must contain $2^{\ell}$ elements. 

Next, fix $S_1 \ne S_2$ distinct nonempty subsets. Then the associated 
columns of M share precisely $2^{n-3}$ positions in which each 
has a nonzero, unit entry. This is again a dimension count. Note 
that S-flip states satisfy $\oplus_{k \in S} b_k = 1$ and that $S_1 \ne S_2$. Thus the 
codimension one subspaces corresponding to $S_1$ and $S_2$ intersect 
transversally in a codimension two subspace. 

Given these claims, label $M = (m_{jk})$ and recall the Kronecker 
delta $\delta_j^k$ which is 1 for $j = k$ and zero else. Now considerations 
of the last two paragraphs demand that for the transpose 
(real adjoint) $M^t$, $M^t M = (m_{kj})(m_{jk}) = 2^{n-3}(\delta_j^k + 1)$. An omitted 
argument then shows $0 \ne \det(M^t M)$, demanding $(\det M)^2 \ne 0$. 
As M is invertible and $\eta^{\oplus}$ is similar to M, we must have $\eta^{\oplus}$ 
invertible. □ 
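
The counting claims in this sketch can be checked by brute force for small n. The code below is an illustrative check, assuming line 1 is the most significant bit and the Grey order is the reflected Gray code; `flip_matrix`, `gram`, and `expected_gram` are hypothetical names. It builds the 0/1 matrix M of flip states and compares $M^t M$ with $2^{n-3}(\delta_j^k + 1)$.

```python
def flip_matrix(n):
    """0/1 matrix M: entry (j, k) is 1 when j is a flip state of the k-th
    Gray-ordered nonempty subset of the top lines, j = 1 .. 2**(n-1) - 1."""
    size = 2 ** (n - 1)
    masks = [i ^ (i >> 1) for i in range(1, size)]
    return [[bin(j & m).count("1") % 2 for m in masks] for j in range(1, size)]


def gram(M):
    """Return the transpose of M times M."""
    size = len(M)
    return [[sum(M[r][a] * M[r][b] for r in range(size)) for b in range(size)]
            for a in range(size)]


def expected_gram(n):
    """2**(n-3) * (Kronecker delta + 1), valid for n >= 3."""
    size = 2 ** (n - 1) - 1
    c = 2 ** (n - 3)
    return [[c * (2 if a == b else 1) for b in range(size)] for a in range(size)]


checks = {n: gram(flip_matrix(n)) == expected_gram(n) for n in range(3, 7)}
```

One way to fill in the omitted determinant step: writing $m = N/2 - 1$ and J for the $m \times m$ all-ones matrix, $M^t M = 2^{n-3}(I + J)$, and since J has eigenvalues m (once) and 0 (with multiplicity $m - 1$), $\det(M^t M) = 2^{(n-3)m}(m + 1) \ne 0$.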

Gate Counts 

Our circuit diagrams are built from blocks realizing $\oplus_S[R_z(\alpha)]$ 
at the right of Figure 3, and the choice of subsets in the Grey 
order causes a large cancellation of controlled-not gates which is 
required for the $O(2^n)$ asymptotic. We now justify the gate count 
of $2^{n+1} - 3$, which for n = 2 coincides with 5 gates [2, §2.2]. 

Except for the recursive call to V, the synthesis algorithm 
writes elementary gates realizing the following computation. 

\[
\oplus_{\emptyset}[R_z(\alpha_0)]\; \oplus_{S_1}[R_z(\alpha_1)] \cdots \oplus_{S_{N/2-1}}[R_z(\alpha_{N/2-1})] \tag{34}
\]

Here, $\oplus_{\emptyset}[R_z(\alpha_0)] = (e^{-i\phi})(1 \otimes W)$ is the one-qubit gate resulting 
on the last tensor factor due to zeroing the obstruction $\eta(-)$. 
We have used the commutativity of $A(n)$ to move this computation 
to the front to preserve the full Grey order including $\emptyset$. 

Now realize each of the $\oplus_S[R_z(\alpha)]$ blocks using the circuits 
at the right of Figure 3. Due to the Grey order, all but one 
controlled-not gate will cancel between any two consecutive $R_z$ 
gates on the bottom line. Thus the gate count in terms of elementary 
gates from the Introduction should account for the following. 

• $2^{n-1}$ controlled rotations $R_z$, since this is the number of 
possibly empty subsets of $\{1, 2, \ldots, n-1\}$. 

• $2^{n-1}$ controlled-not gates, since one lies to the right of each 
$R_z$ gate. 

Thus prior to the recursive call, in $n \ge 2$ qubits the algorithm will 
place $2^n$ elementary gates. 

To obtain the exact count, stop the recursive count at n = 2 
qubits. 

\[
2^n + 2^{n-1} + \cdots + 8 + 4 = 2^{n+1} - 4 \tag{35}
\]

The end case of recursion is for n = 1. Since any one-qubit diagonal 
may be written $e^{i\phi} R_z(\alpha)$, the remaining one-qubit diagonal 
requires one elementary gate. Thus the grand total is $2^{n+1} - 3$ 
elementary gates. 
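
The recursion behind this count is short enough to check directly. The function below is a minimal sketch (the name `gate_count` is hypothetical) that charges $2^n$ elementary gates per level for $n \ge 2$ and one rotation at n = 1, then compares the total with the closed form.

```python
def gate_count(n):
    """Elementary gates placed by the recursion: 2**n at each level n >= 2
    (2**(n-1) rotations plus 2**(n-1) controlled-nots), one R_z at n = 1."""
    if n == 1:
        return 1
    return 2 ** n + gate_count(n - 1)


closed_form_holds = all(gate_count(n) == 2 ** (n + 1) - 3 for n in range(1, 21))
```

For n = 2 the function returns 5, the count quoted above for the two-qubit case.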



5 Stable Lower Bounds 

This section justifies the claim of stably asymptotic optimality 
in Theorem 1.3 using an argument similar to one by E. Knill [10, 
Theorem 3.4]. We provide a greater level of detail and tailor the 
discussion to synthesis within a subgroup $H \subset U(N)$. Our argument 
is somewhat simpler because we are dealing with elementary 
gates from the Introduction while Knill uses basic gates [1]. 
Thus let $S \subset U(N)$. We introduce the following convention: 

\[
\bar{S} = \{ e^{i\phi} V ; 0 \le \phi < 2\pi, V \in S \} \tag{36}
\]

This will allow us to ignore global phases in the following discussion. 
Note that $\bar{A} = A$. 

We now expand on comments made briefly in Definition 1.2 
of the Introduction. A circuit topology$^1$ $\tau$ is an n-line diagram on 
which is marked a sequence of gate-holders. These gate-holders 
are either controlled-not gates joining any two lines or boxes 
labelled either Y or Z. To specialize the circuit topology $\tau$ to an 
actual circuit, one chooses parameters for either an $R_y(\theta)$ gate or 
an $R_z(\alpha)$ gate to place into boxes labelled Y or Z respectively. 
We define $\#\tau$ to be the total of the number of controlled-nots and 
boxes, while $\dim \tau$ denotes the number of boxes. Label $S_\tau$ to be 
the subset of all $V \in U(N)$ that result from choosing particular 
parameters for an $R_y(\theta)$ gate in each Y box and an $R_z(\alpha)$ gate in 
each Z box. We say that $\tau$ specializes stably to an analytic 
subgroup $H \subset U(N)$ when $\bar{S}_\tau \subset H$. 

Lemma 5.1 Suppose $\tau$ specializes stably to H and $\dim \tau + 1 < \dim H$. 
Then $\bar{S}_\tau$ is a measure zero subset of H. 

Proof: We appeal to Sard's theorem from differential topology 
[5, p.39]. Consider the map $f : \mathbb{R}^{\dim\tau+1} \to H$ which 
carries a tuple $(\phi, t_1, t_2, \ldots, t_{\dim\tau})$ to the $e^{i\phi}V$ which is the 
phase $e^{i\phi}$ multiplied by the specialization of $\tau$ corresponding to 
$t_1, t_2, \ldots, t_{\dim\tau}$. The map f is smooth. 

By Sard's theorem [5, p.39], for all but a measure zero subset 
of $h \in H$ one of the following two cases holds: 

• There is no choice of parameter v with $f(v) = h$. 

• For each v with $f(v) = h$, the derivative linear map at the 
parameter v denoted $df_v : \mathbb{R}^{\dim\tau+1} \to T_h(H)$ is onto. 

The second possibility is absurd by the dimension hypothesis. 
Thus $f(\mathbb{R}^{\dim\tau+1})$ is a measure zero subset of H. □ 

Proposition 5.2 Fix n, and let q be a quantum circuit synthesis 
algorithm inputting $a \in A(n)$ and outputting stably to $A(n)$. Then 
$\#q \ge 2^n - 1 = N - 1$. 

Proof: Let C be a countable set with $\{\tau(c) ; c \in C\}$ the set of 
topologies output by q. Now $\dim A(n) = N$. Thus assume by way 
of contradiction $\#q < N - 1$. Then we may write 

\[
A(n) = \bigcup_{c \in C} \bar{S}_{\tau(c)} \tag{37}
\]

This is impossible by Lemma 5.1. Indeed, a countable union of 
measure zero subsets is still measure zero and hence cannot cover 
$A(n)$. □ 

$^1$ We discuss here circuit topologies in the elementary gate library. 

Corollary 5.3 Let $\{q(n)\}$ be a family of synthesis algorithms, 
each of which accepts all inputs from $A(n)$ and outputs stably to 
$A(n)$ per Definition 1.2. If $\#q(n) \in O(2^n)$, then $\{q(n)\}$ is stably 
asymptotically optimal. 
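
As a small arithmetic aside, the lower bound of Proposition 5.2 can be compared with the $2^{n+1} - 3$ elementary gates used by the synthesis algorithm. The display below only restates those two counts as a ratio; it is not an additional result.

```latex
\[
\frac{2^{n+1}-3}{2^{n}-1} \;=\; 2 - \frac{1}{2^{n}-1} \;<\; 2 , \qquad n \ge 1 .
\]
```

This is the sense in which the circuits are within a factor of two of the smallest possible.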

6 Conclusions and On-Going Work 

We realize quantum circuits for any diagonal $U = \sum_{j=0}^{N-1} u_j |j\rangle\langle j|$ 
consisting of at most $2^{n+1} - 3$ alternating controlled-not gates 
and z-axis Bloch sphere rotations on individual qubits. The 
construction uses a new circuit block, the XOR-controlled rotation. 
This $O(2^n)$ construction is optimal in the following sense. In the 
worst case and also the generic case, at least $2^n - 1$ one-qubit 
rotations are required to construct such a diagonal U. Thus our 
constructive algorithm shows that the synthesis of quantum circuits 
for diagonal computations is in fact $\Theta(2^n)$. Note that special-case 
computations such as tensors of one-qubit diagonal computations 
may require fewer gates. 

The circuits above have several common applications. For 
example, they are useful when constructing a circuit for a 
top-conditioned V computation given a circuit diagram for V correct 
up to relative phase. They are also needed when applying projective 
measurements other than the typical $\{|j\rangle\langle j| ; 0 \le j \le N - 1\}$. 
In our ongoing work, we will explore applications relating to the 
synthesis of real quantum computations and also exotic quantum 
circuit synthesis algorithms relying on KAK metadecompositions 
of $U(2^n)$. 

A Synthesis via Controlled Rotations 

This appendix describes a synthesis algorithm using the 
$\Lambda_k[R_z(\alpha)]$ circuit subblocks. Recall our constructive proof of 
the upper bound on gate counts of Theorem 1.3 used $\oplus_k[R_z(\alpha)]$ 
subblocks instead. Several technical issues arising in our main 
algorithm also arise here. Thus, this appendix may serve as an 
introduction to how to use the obstruction $\eta(-)$ of Definition 3.2 
to form a recursive synthesis algorithm reducing n-qubit diagonals 
to $(n-1)$-qubit diagonals. 

Computation of $\eta(\Lambda_S[R_z(\alpha)])$ 

Recall from the Introduction $U = \sum_{j=0}^{N-1} u_j |j\rangle\langle j|$ for $N = 2^n$ a 
fixed n-qubit diagonal quantum computation. Further recall that 
for $S \subset \{1, 2, \ldots, n-1\}$, by $\Lambda_S(V)$ for $V \in U(2^1)$ we mean that 
instance of the #S-conditioned computation $\Lambda_{\#S}(V)$ which is 
conditioned on lines $\{j \in S\}$ and acts on line n. 

Figure 4: This diagram [ , Lemma 7.11] illustrates how to realize 
a $\Lambda_k[R_z(\alpha)]$ via a singly controlled rotation and k-controlled-nots. 
The latter may be synthesized using $O(k)$ elementary gates, 
given the ancilla qubit shown as the top line. Without the ancilla, 
$O(k^2)$ gates would be required per Corollary 7.6 ibid. The diagram 
at right recalls the next step of the decomposition. 

Every computation $\Lambda_S[R_z(\alpha)]$ is also diagonal. We seek an 
explicit formula for $\eta(\Lambda_S[R_z(\alpha)])$. With sufficient understanding 
of how $\Lambda_S[R_z(\alpha)]$ affects $\eta(-)$, we will be able to choose 
exact angles $\alpha$ so that prepending the conditioned blocks to U 
forces the composite to have $\eta = 0$. Thus the composite will be 
a tensor by Corollary 3.3, allowing for recursion. The following 
language is useful for expressing and computing $\eta(\Lambda_S[R_z(\alpha)])$. 
It is slightly more convenient to use the mathematical notation 
for vectors rather than bra-ket. 

Definition A.1 For $1 \le j \le N/2 - 1$, let $e_j$ denote the standard 
basis column vectors for $\mathbb{R}^{N/2-1}$, i.e., $e_j$ has a single entry of 
1 in the $j$-th row and all other entries 0. We further define the 
vectors $v_j = e_j - e_{j+1}$ if $1 \le j \le N/2 - 2$, also setting $v_0 = -e_1$ 
and $v_{N/2-1} = e_{N/2-1}$. 

Observe that the vectors $\{v_j ; 1 \le j \le N/2 - 1\}$ form a basis 
for $\mathbb{R}^{N/2-1}$. We need one further convention to describe 
$\eta(\Lambda_S[R_z(\alpha)])$. 

Definition A.2 Let $1 \le j \le N/2 - 1$, with binary representation 
$j = b_1 b_2 \cdots b_{n-1}$ for $b_1, b_2, \ldots, b_{n-1} \in \mathbb{F}_2$. Let $S \subset \{1, 2, \ldots, n-1\}$, 
$S \ne \emptyset$. We say that j is S-conditioned iff $\prod_{k \in S} b_k = 1$. We 
label $C(S) = \{j ; j \text{ is } S\text{-conditioned}\}$. 

Proposition A.3 Let $C(S)$ denote the S-conditioned set for some 
nonempty $S \subset \{1, \ldots, n-1\}$. Then 



\[
\eta(\Lambda_S[R_z(\alpha)]) = \alpha \sum_{j \in C(S)} v_j \tag{38}
\]

Proof: Label $V = \Lambda_S[R_z(\alpha)] = \sum_{j=0}^{N-1} \lambda_j |j\rangle\langle j|$. We recall that 
$\eta(V)$ is defined in terms of $\chi_j(V) = \lambda_{2j-2}\lambda_{2j-1}^{-1}\lambda_{2j}^{-1}\lambda_{2j+1}$. Now 
if $j \in C(S)$, then $\lambda_{2j} = e^{-i\alpha/2}$ and $\lambda_{2j+1} = e^{i\alpha/2}$. If the binary 
expression for j is not S-conditioned, then $\lambda_{2j} = \lambda_{2j+1} = 1$. 
Continuing in this manner, say the binary expression for $j - 1 \in C(S)$. 
Then $\lambda_{2j-2} = e^{-i\alpha/2}$ and $\lambda_{2j-1} = e^{i\alpha/2}$, else $\lambda_{2j-2} = \lambda_{2j-1} = 1$. 
Thus letting $\delta_{C(S)}(j)$ denote the indicator function of $C(S)$, 

\[
-i \log \chi_j(V) = \alpha\,\delta_{C(S)}(j) - \alpha\,\delta_{C(S)}(j-1) \tag{39}
\]

This expression agrees componentwise with the result of the 
proposition, given Definition 3.2. □ 
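
The formula of the proposition can be compared against a direct computation of the obstruction. The sketch below is illustrative and uses hypothetical helper names. It assumes phases are stored in units of $\pi$, that line 1 is the most significant bit of $j = b_1 \cdots b_{n-1}$, and that $\chi_j(U) = u_{2j-2}\,u_{2j-1}^{-1}\,u_{2j}^{-1}\,u_{2j+1}$.

```python
from fractions import Fraction as F


def eta(theta):
    """Obstruction of diag(exp(i*pi*theta[j])), in units of pi."""
    return [theta[2*j - 2] - theta[2*j - 1] - theta[2*j] + theta[2*j + 1]
            for j in range(1, len(theta) // 2)]


def is_conditioned(j, S, n):
    """True when every line k in S carries a 1 in j = b_1 b_2 ... b_{n-1}."""
    return all((j >> (n - 1 - k)) & 1 for k in S)


def controlled_rz(S, alpha, n):
    """Phases of Lambda_S[R_z(alpha)]: (-alpha/2, +alpha/2) on line n for
    S-conditioned j, and (0, 0) otherwise."""
    phases = []
    for j in range(2 ** (n - 1)):
        phases += [-alpha / 2, alpha / 2] if is_conditioned(j, S, n) else [F(0), F(0)]
    return phases


def v(j, size):
    """v_j = e_j - e_{j+1} for j < size, and v_size = e_size (1-based)."""
    vec = [0] * size
    vec[j - 1] = 1
    if j < size:
        vec[j] = -1
    return vec


n, S, alpha = 4, (1, 3), F(1, 3)
size = 2 ** (n - 1) - 1
C = [j for j in range(1, size + 1) if is_conditioned(j, S, n)]
direct = eta(controlled_rz(S, alpha, n))
formula = [alpha * sum(v(j, size)[r] for j in C) for r in range(size)]
```

Here `C` is [5, 7] and the two vectors `direct` and `formula` coincide, each equal to $\alpha\,[0\ 0\ 0\ 0\ 1\ {-1}\ 1]^t$.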

Example A.4 Consider n = 4 qubits for the subset S = {1,3} 
and $0 \le \alpha < 2\pi$ arbitrary. Label $\phi = -\alpha/2$, so that 
$R_z(\alpha) = e^{i\phi}|0\rangle\langle 0| + e^{-i\phi}|1\rangle\langle 1|$. Since $V = \Lambda_S[R_z(\alpha)]$ is diagonal, 
we describe the quantum computation by specifying multiples on each 
computational basis state. 

state     mult    state     mult    state     mult          state     mult
|0000⟩    1       |0100⟩    1       |1000⟩    1             |1100⟩    1
|0001⟩    1       |0101⟩    1       |1001⟩    1             |1101⟩    1
|0010⟩    1       |0110⟩    1       |1010⟩    $e^{i\phi}$    |1110⟩    $e^{i\phi}$
|0011⟩    1       |0111⟩    1       |1011⟩    $e^{-i\phi}$   |1111⟩    $e^{-i\phi}$

Thus, $\chi_1(V) = 1$, $\chi_2(V) = 1$, $\chi_3(V) = 1$, $\chi_4(V) = 1$, 
$\chi_5(V) = e^{-2i\phi}$, $\chi_6(V) = e^{2i\phi}$, and $\chi_7(V) = e^{-2i\phi}$. Thus we have directly 
computed that $\eta(\Lambda_S[R_z(\alpha)]) = -2\phi\,[0\ 0\ 0\ 0\ 1\ {-1}\ 1]^t$. 

On the other hand, $C(\{1,3\}) = \{101_b, 111_b\} = \{5, 7\}$, where 
the subscript denotes binary. Thus 

\[
v_5 + v_7 = (e_5 - e_6) + e_7 = [0\ 0\ 0\ 0\ 1\ {-1}\ 1]^t \tag{40}
\]

Thus we computed the right-hand side of Proposition A.3. ◇

A.1 $\Lambda_k[R_z(\alpha)]$-block synthesis algorithm 

Before the following definition, we note a happy accident. There 
are $N/2 - 1$ nonempty subsets of the top lines $\{1, \ldots, n-1\}$, 
and moreover $N/2 - 1$ characters $\chi_j : A(n) \to U(1)$ which must 
be zeroed within the components of the obstruction $\eta(-)$ to form 
a tensor. Thus, the following matrix is square. 

Definition A.5 The $(N/2 - 1) \times (N/2 - 1)$ real matrix $\eta^{\Lambda}$ is 
defined as follows. Order nonempty subsets $S_1, S_2, \ldots, S_{2^{n-1}-1}$ in 
dictionary order. Then for $1 \le j \le N/2 - 1$, the $j$-th column of 
$\eta^{\Lambda}$ is $\eta(\Lambda_{S_j}[R_z(1 \text{ radian})])$. 

Lemma A.6 Let $\alpha = [\alpha_1 \cdots \alpha_{N/2-1}]^t$. Then for $S_1, S_2, \ldots, S_{N/2-1}$ 
the dictionary ordering of nonempty subsets of 
$\{1, \ldots, n-1\}$, 

\[
\eta\big( \Lambda_{S_1}[R_z(\alpha_1)]\, \Lambda_{S_2}[R_z(\alpha_2)] \cdots \Lambda_{S_{N/2-1}}[R_z(\alpha_{N/2-1})] \big) = \eta^{\Lambda}\alpha \tag{41}
\]

Here, the right hand side denotes matrix multiplication by the 
column vector $\alpha$. 

Sketch: Recall that for any character $\chi : A \to \mathbb{C} - \{0\}$, one has 
$\log\chi(VW) = \log\chi(V) + \log\chi(W)$ and $\log\chi(V^a) = a\log\chi(V)$ for 
$V, W \in A$, $a \in \mathbb{R}$. Recall Definition 3.2 and apply these properties 
to the entries $-i\log\chi_j$ of the vector valued function $\eta(-)$. □ 

We now state the $\Lambda_k[R_z(\alpha)]$-block synthesis algorithm for diagonal 
unitary computations. The proof of correctness follows in 
the next subsection and includes a proof of the subtle fact that the 
matrix $\eta^{\Lambda}$ is invertible. 
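
The matrix of Definition A.5 can be assembled column by column from Proposition A.3. The sketch below is illustrative only. It assumes line 1 is the most significant bit of $j$, realizes dictionary order by sorting subsets as tuples, and uses hypothetical function names. The final lines check Lemma A.6 on one sample angle vector for three qubits.

```python
from fractions import Fraction as F
from itertools import combinations


def dictionary_subsets(n):
    """Nonempty subsets of {1, ..., n-1} as sorted tuples, in dictionary order."""
    top = range(1, n)
    return sorted(c for r in range(1, n) for c in combinations(top, r))


def eta_lambda(n):
    """Column for S is the sum of v_j over S-conditioned j (Proposition A.3),
    with v_j = e_j - e_{j+1} and v_{N/2-1} = e_{N/2-1}."""
    size = 2 ** (n - 1) - 1
    cols = []
    for S in dictionary_subsets(n):
        col = [0] * size
        for j in range(1, size + 1):
            if all((j >> (n - 1 - k)) & 1 for k in S):   # j is S-conditioned
                col[j - 1] += 1
                if j < size:
                    col[j] -= 1
        cols.append(col)
    return [[col[r] for col in cols] for r in range(size)]


matrix = eta_lambda(3)
alpha = [F(-1, 6), F(-4, 6), F(2, 6)]          # sample angles, units of pi
image = [sum(a * x for a, x in zip(row, alpha)) for row in matrix]
```

For n = 3 the subsets are {1}, {1,2}, {2} and the matrix rows are [0, 0, 1], [1, 0, -1], [0, 1, 1]. The sample vector is mapped to 2/6, -3/6, -2/6 in units of $\pi$.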

Controlled Rotation Synthesis Algorithm Let $U = \sum_{j=0}^{N-1} u_j |j\rangle\langle j|$, 
for which we wish to synthesize a circuit diagram 
in terms of $\Lambda_k[R_z(\alpha)]$ blocks. Label $S_1, S_2, S_3, \ldots, S_{2^{n-1}-1}$ 
the nonempty subsets of the top $n - 1$ lines $\{1, \ldots, n-1\}$ in 
dictionary order. 

1. Compute the obstruction $\psi = \eta(U)$. 

2. Compute the inverse matrix $(\eta^{\Lambda})^{-1}$. 

3. Compute $\alpha = (\eta^{\Lambda})^{-1}\psi$, treating $\psi$ as a column vector. 
Label $\alpha = [\alpha_1 \cdots \alpha_{N/2-1}]^t$. 

4. Compute the diagonal quantum computation 
$\tilde{U} = \Lambda_{S_1}[R_z(-\alpha_1)] \cdots \Lambda_{S_{N/2-1}}[R_z(-\alpha_{N/2-1})]\, U$. As 
is verified below, $\tilde{U}$ is a tensor. 

5. Use the argument of Proposition 3.1 to compute $\tilde{U} = V \otimes W$ for 
$V \in A(n-1)$ and $W = e^{i\phi} R_z(\alpha_0)$ for some angle $\alpha_0$. 

6. Given prior computations, the following expression holds: 

\[
U = e^{i\phi}\,(V \otimes 1)\; \Lambda_{\emptyset}[R_z(\alpha_0)]\; \Lambda_{S_1}[R_z(\alpha_1)] \cdots \Lambda_{S_{N/2-1}}[R_z(\alpha_{N/2-1})] \tag{42}
\]

Here, 1 denotes the trivial computation of $U(2^1)$. Also, 
$\Lambda_{\emptyset}[R_z(\alpha_0)]$ means $1 \otimes R_z(\alpha_0)$ for $1 \in U(N/2)$. 

7. Techniques from the literature are then used to decompose 
each $\Lambda_{S_j}[R_z(\alpha_j)]$ into elementary gates per Figure 4. 

8. The algorithm terminates by recursively producing a circuit 
diagram for $V \in A(n-1)$. 
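
Steps 4 and 5 can be traced with exact rational phases on a 3-qubit diagonal. The sketch below is illustrative, with hypothetical helper names. It stores phases in units of $\pi$, takes line 1 as the most significant bit, and assumes the input phases $6, 3, 9, 8, 5, 1, 6, 0$ (in units of $\pi/6$) together with the angle vector $\alpha = [-\pi/6,\ -4\pi/6,\ 2\pi/6]^t$ that step 3 produces for that input.

```python
from fractions import Fraction as F


def controlled_rz(S, alpha, n):
    """Phases (units of pi) of Lambda_S[R_z(alpha)]: the pair (-alpha/2,
    +alpha/2) on line n when every line in S carries a 1, else (0, 0)."""
    phases = []
    for j in range(2 ** (n - 1)):
        on = all((j >> (n - 1 - k)) & 1 for k in S)
        phases += [-alpha / 2, alpha / 2] if on else [F(0), F(0)]
    return phases


def split_tensor(t):
    """Return (v, w) with t the phases of V (x) W and v[0] = 0; raise
    ValueError when the diagonal is not a tensor on the last line."""
    w = [t[0], t[1]]
    v = [t[2 * j] - t[0] for j in range(len(t) // 2)]
    if any(t[2 * j + 1] != v[j] + w[1] for j in range(len(v))):
        raise ValueError("diagonal is not a tensor on the last line")
    return v, w


n = 3
theta = [F(k, 6) for k in (6, 3, 9, 8, 5, 1, 6, 0)]   # u_j = exp(i*pi*theta[j])
subsets = [(1,), (1, 2), (2,)]                         # dictionary order
alpha = [F(-1, 6), F(-4, 6), F(2, 6)]                  # step 3, units of pi

tilde = list(theta)                                    # step 4
for S, a in zip(subsets, alpha):
    tilde = [t + p for t, p in zip(tilde, controlled_rz(S, -a, n))]

v, w = split_tensor(tilde)                             # step 5
```

In units of $\pi/12$, `tilde` holds 12, 6, 20, 14, 9, 3, 9, 3, while `v` holds 0, 8, -3, -3 and `w` holds 12, 6. The split is normalized so that the first entry of V is 1; any other choice differs only by moving a global phase between V and W.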

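Example A.7 

In three qubits, consider the following diagonal computation. 

\[
\begin{aligned}
U ={}& e^{6\pi i/6}|0\rangle\langle 0| + e^{3\pi i/6}|1\rangle\langle 1| + e^{9\pi i/6}|2\rangle\langle 2| + e^{8\pi i/6}|3\rangle\langle 3| \\
&+ e^{5\pi i/6}|4\rangle\langle 4| + e^{\pi i/6}|5\rangle\langle 5| + e^{6\pi i/6}|6\rangle\langle 6| + 1\,|7\rangle\langle 7|
\end{aligned} \tag{43}
\]

Then one has $\chi_1(U) = e^{2\pi i/6}$, $\chi_2(U) = e^{-3\pi i/6}$, $\chi_3(U) = e^{-2\pi i/6}$ 
so that $\psi = \eta(U) = [2\pi/6\ \ {-3\pi/6}\ \ {-2\pi/6}]^t$. 

We now must compute $\alpha$ by computing the inverse matrix 
$(\eta^{\Lambda})^{-1}$. For this matrix, first compute the following. 

\[
\begin{aligned}
\eta(\Lambda_{\{1\}}[R_z(1)]) &= v_2 + v_3 = [0\ 1\ 0]^t \\
\eta(\Lambda_{\{1,2\}}[R_z(1)]) &= v_3 = [0\ 0\ 1]^t \\
\eta(\Lambda_{\{2\}}[R_z(1)]) &= v_1 + v_3 = [1\ {-1}\ 1]^t
\end{aligned} \tag{44}
\]

The following inverse matrix results, and it may be reused for 
multiple specific diagonals U. 

\[
(\eta^{\Lambda})^{-1} = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & -1 \\ 0 & 1 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} 1 & 1 & 0 \\ -1 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix} \tag{45}
\]

So $\alpha = (\eta^{\Lambda})^{-1}\psi = [-\pi/6\ \ {-4\pi/6}\ \ 2\pi/6]^t$. Hence $\tilde{U}$ as defined 
below is a tensor. 

\[
\tilde{U} = \Lambda_{\{1\}}[R_z(\pi/6)]\; \Lambda_{\{1,2\}}[R_z(4\pi/6)]\; \Lambda_{\{2\}}[R_z(-2\pi/6)]\; U \tag{46}
\]

In order to verify this directly, we compute the eight diagonal 
matrix coefficients of each of the $\Lambda_S[R_z(\alpha)]$. To save space, we write 
$\mathrm{diag}(\lambda_0, \ldots, \lambda_7)$ for $\lambda_0|0\rangle\langle 0| + \cdots + \lambda_7|7\rangle\langle 7|$. 

\[
\begin{aligned}
\Lambda_{\{1\}}[R_z(\pi/6)] &= \mathrm{diag}(1, 1, 1, 1, e^{-\pi i/12}, e^{\pi i/12}, e^{-\pi i/12}, e^{\pi i/12}) \\
\Lambda_{\{1,2\}}[R_z(4\pi/6)] &= \mathrm{diag}(1, 1, 1, 1, 1, 1, e^{-4\pi i/12}, e^{4\pi i/12}) \\
\Lambda_{\{2\}}[R_z(-2\pi/6)] &= \mathrm{diag}(1, 1, e^{2\pi i/12}, e^{-2\pi i/12}, 1, 1, e^{2\pi i/12}, e^{-2\pi i/12})
\end{aligned} \tag{47}
\]

Then multiplying, the expression demonstrates $\tilde{U} = V \otimes W$. 

\[
\tilde{U} = \mathrm{diag}(e^{12\pi i/12}, e^{6\pi i/12}, e^{20\pi i/12}, e^{14\pi i/12}, e^{9\pi i/12}, e^{3\pi i/12}, e^{9\pi i/12}, e^{3\pi i/12}) \tag{48}
\]

Since $\tilde{U}$ is a tensor, we obtain the following decomposition of U. 

\[
\begin{aligned}
U ={}& \Lambda_{\{1\}}[R_z(-\pi/6)]\; \Lambda_{\{1,2\}}[R_z(-4\pi/6)]\; \Lambda_{\{2\}}[R_z(2\pi/6)] \\
&\cdot \big[ \mathrm{diag}(1, e^{8\pi i/12}, e^{-3\pi i/12}, e^{-3\pi i/12}) \otimes \mathrm{diag}(e^{12\pi i/12}, e^{6\pi i/12}) \big]
\end{aligned} \tag{49}
\]

The algorithm then recursively synthesizes the 2-qubit diagonal 
$V = 1\,|0\rangle\langle 0| + e^{8\pi i/12}|1\rangle\langle 1| + e^{-3\pi i/12}|2\rangle\langle 2| + e^{-3\pi i/12}|3\rangle\langle 3|$. ◇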

Proof of correctness of $\Lambda_k[R_z(\alpha)]$-block synthesis 

We briefly verify that $\tilde{U} = V \otimes W$. First use Lemma A.6 for 

\[
\eta\big( \Lambda_{S_1}[R_z(-\alpha_1)] \cdots \Lambda_{S_{N/2-1}}[R_z(-\alpha_{N/2-1})] \big) = -\psi \tag{50}
\]

Then the property $\eta(U_1 U_2) = \eta(U_1) + \eta(U_2)$ demands 

\[
\eta\big( \Lambda_{S_1}[R_z(-\alpha_1)] \cdots \Lambda_{S_{N/2-1}}[R_z(-\alpha_{N/2-1})]\, U \big) = -\psi + \psi = 0 \tag{51}
\]

So by the restatement of Proposition 3.1, we have $\tilde{U} = V \otimes W$. 
The algorithm also uses the following proposition. 

Proposition A.8 The matrix $\eta^{\Lambda}$ per Definition A.5 is an invertible 
$(2^{n-1} - 1) \times (2^{n-1} - 1)$ matrix. 

Sketch: It suffices instead to consider the similar matrix corresponding 
to a change of basis to $v_j$, $1 \le j \le N/2 - 1$ of Definition 
4.2. Thus, if $B = [v_1\ v_2\ \cdots\ v_{N/2-1}]$ is the change of basis matrix, 
the matrix similar to $\eta^{\Lambda}$ is $M = B^{-1}\eta^{\Lambda}B = (m_{jk})$. Now $m_{jk} = 0$ 
if j is not $S_k$-conditioned and $m_{jk} = 1$ if j is $S_k$-conditioned. 

M is invertible since column operations reduce M to a permutation 
matrix. Indeed, the last $e_{N/2-1}$ column may be used to 
clear all other nonzero entries in the last row. Then each of the 
columns corresponding to $n - 2$ element subsets retains a single 
nonzero entry, and the corresponding rows may be cleared. □ 